Imperial Institute of Agricultural Research, Pusa.



Vol. V


THE ANNALS 

of 

MATHEMATICAL 

STATISTICS 


Published and Lithoprinted by

EDWARDS BROTHERS, INC.
ANN ARBOR, MICH.



THE ANNALS OF MATHEMATICAL STATISTICS is
affiliated with the American Statistical Association and is devoted
to the theory and application of Mathematical Statistics.


Published Quarterly: March, June, September, December


Four Dollars per annum


H. C. CARVER, Editor

A. L. O'TOOLE, Associate Editor


The Annals is not copyrighted: any articles or tables appearing therein may
be reproduced in whole or in part at any time if accompanied by
the proper reference to this publication


Address: Annals of Mathematical Statistics
Post Office Box 171, Ann Arbor, Michigan



CONTENTS OF VOLUME V


The Accuracy of Computation with Approximate Numbers . . . 1
    Helen M. Walker and Vera Sanford

Combining Two Probability Functions . . . 13
    William Dowell Baten

On the Systematic Fitting of Straight Line Trends by Stencil and Calculating Machine . . . 21
    Herbert A. Toops

Statistical Analysis of One-Dimensional Distributions . . . 30
    Robert Schmidt, Kiel, Germany

Editorial: A New Type of Average for Security Prices . . . 73
    H. C. Carver

On a New Method of Computing Non-Linear Regression Curves . . . 81
    Walter Andersson, Lund, Sweden

The Standard Error of Any Analytic Function of a Set of Parameters Evaluated by the Method of Least Squares . . . 107
    Walter A. Hendricks

Transformation of Non-Normal Frequency Distributions into Normal Distributions . . . 113
    G. A. Baker

Invariants and Covariants of Certain Frequency Curves . . . 124
    Richmond T. Zoch

Quadrature of the Normal Curve . . . 136
    E. R. Enlow

Editorial: On a Best Value of R in Sample of R from a Finite Population of N . . . 146
    A. L. O'Toole

Editorial: Punched Card Systems and Statistics . . . 153
    H. C. Carver

The Method of Path Coefficients . . . 161
    Sewall Wright

Mathematical Foundation for a Method of Statistical Analysis of Household Budgets . . . 216
    John W. Boldyreff

On the Relative Stability of the Median and Arithmetic Mean, with Particular Reference to Certain Frequency Distributions which can be Dissected into Normal Distributions . . . 227
    Harry S. Pollard

An Application of Characteristic Functions to the Distribution Problem of Statistics . . . 263
    Solomon Kullback

On Measures of Contingency . . . 308
    Frank M. Weida

Note on Koshal's Method of Improving the Parameters of Curves by the Use of the Method of Maximum Likelihood . . . 320
    R. J. Myers

The Adequacy of "Student's" Criterion of Deviations in Small Sample Means . . . 324
    Alan E. Treloar and Marian A. Wilder



THE ACCURACY OF COMPUTATION WITH 
APPROXIMATE NUMBERS 

By 

Helen M. Walker 
Teachers College, Columbia University 

and

Vera Sanford 

Oneonta State Normal School. 

1. General Considerations.

The number of figures necessarily free from error in the result 
of a piece of computation may be determined by studying the rela- 
tion between the number of digits in the result and the number of 
digits in the maximum error of the computation. It is the purpose 
of this essay to derive rules for the determination of the number 
of digits which are certain to be correct in computations based on 
measurement, but it must be understood that these rules state the 
minimum number of correct digits so that the result of a specific 
piece of computation may be accurate for more places than the 
rules indicate.

2. Notation. 

Since the location of the decimal point has no connection with 
significant figures in a given number, it is assumed that the decimal 
point follows the last significant figure in each of the original num- 
bers, the argument being somewhat simplified by this assumption. 
Accordingly the greatest error in the statement of the original
numbers is ± 0.5. Let A and B be the true values of two numbers
such that $A = a \cdot 10^{m}$ and $B = b \cdot 10^{n}$, where $m$ and $n$ are
positive integers and where $0.1 \le a < 1.0$ and where $0.1 \le b < 1.0$.

Then by the convention adopted above, the number of significant
figures in A and B are $m$ and $n$ respectively, and the observed
values are not less than $A - 0.5$ and $B - 0.5$ and not more than
$A + 0.5$ and $B + 0.5$. Let $\epsilon$ represent the maximum error in the
computation and let $\epsilon'$ be the value of the largest term in the
expansion of $\epsilon$.

3. Products.

The greatest error in the product of A and B will occur when
each is in excess by 0.5, the value of this error being
$\epsilon = 0.5(A + B) + 0.25$.

For $ab \ge 0.1$, $AB$ has $m + n$ digits to the left of the decimal point.

For $ab < 0.1$, $AB$ has $m + n - 1$ digits to the left of the decimal
point. The cases to be considered are


(I)   $m = n = 1$
(II)  $m = n > 1$
(III) $m - n = 1$
(IV)  $m - n > 1$


(I) Let $m = n = 1$. In this extreme case each factor consists
of a single digit and the product consists of one or two digits.
In this case, the figure in the unit's place is always affected by the
maximum error and the figure in the ten's place, when present, is
generally so affected.

(II) Let $m = n > 1$. Here

$$\epsilon = 0.5(a + b)10^{n} + 0.25.$$

But $0.1 \le a < 1.0$ and $0.1 \le b < 1.0$,

and therefore $0.1(10^{n}) + 0.25 \le \epsilon < 10^{n} + 0.25$.

The following conditions are possible:

Either (1) $ab \ge 0.1$ and $AB$ has $2n$ digits to the left of the
decimal point,
or (2) $ab < 0.1$ and $AB$ has $2n - 1$ digits to the left of
the decimal point.

Either (3) $\epsilon$ has $n$ digits to the left of the point,
or (4) $\epsilon$ has $n + 1$ digits to the left of the point, the
first one being 1 and all the others to the left of
the decimal point being zero. This can occur only
when $\epsilon$ is very near its maximum value. For
example, when $n = 4$, the value of $\epsilon$ must be less
than 10000.25.

Then if conditions (1) and (3) are fulfilled, the result has $n$
more digits to the left of the decimal point than has the error. A
subsequent proof will show that this means that at least $n - 1$ places
in the result are free from error. Under conditions (1) and (4),
the difference in the number of digits is $n - 1$, and at least $n - 2$
are not affected by the error. Similarly under conditions (2) and
(3), $n - 2$ digits are not affected by the error. Conditions (2)
and (4) cannot occur simultaneously.



The proof that conditions (2) and (4) are incompatible with
the conditions that $0.1 \le a < 1.0$ and $0.1 \le b < 1.0$ may be obtained
from fig. 1. The area within which these limits hold for $a$
and for $b$ is the area of the square bounded by $a = 1.0$, $a = 0.1$,
$b = 1.0$, $b = 0.1$, the numerical value of this area being 0.81. The
region within which $ab < 0.1$ is the area below the hyperbola $ab = 0.1$.
The region within which $\epsilon$ is larger than a specified value is
the region above the corresponding line $a + b = \text{const}$. Therefore the probability that
all of these conditions shall be simultaneously fulfilled is the ratio
of the shaded region in fig. 1 to the total area of the square, or
to 0.81. When ^ 5, this probability is only 0.000,014,8 and
when >0.55, the probability vanishes altogether.

(III) Let $m - n = 1$. Then $\epsilon = 0.5(10a + b)10^{n} + 0.25$.

But $1.1 \le 10a + b < 11$.

Now either $n = 1$ or $n > 1$.

Let $n = 1$. Then $5.75 \le \epsilon < 55.25$.

Either (1) $ab \ge 0.1$ and $AB$ has 3 digits to the left of the
decimal point,
or (2) $ab < 0.1$ and $AB$ has 2 digits to the left of the
decimal point.

Either (3) $10a + b < 1.95$ and $5.75 \le \epsilon < 10$,
or (4) $10a + b \ge 1.95$ and $10 \le \epsilon < 55.25$.

If conditions (1) and (3) are met, the product has 2 more
digits to the left of the decimal point than has the error. Thus one
or two places will in general be free from error. Under conditions
(1) and (4) or conditions (2) and (3) the number of such places
free from error is 0 or 1. By an analysis similar to that given
under (II) it appears that there is about one chance in ten that
conditions (2) and (4) should be simultaneously met, in which
case no place would be free from error.

Let $n = 2$. Then $55.25 \le \epsilon < 550.25$.

Either (1) $ab \ge 0.1$ and $AB$ has 5 digits to the left of the
decimal point,
or (2) $ab < 0.1$ and $AB$ has 4 digits to the left of the
decimal point.

Either (3) $10a + b < 1.995$ and $55.25 \le \epsilon < 100$,
or (4) $10a + b \ge 1.995$ and $100 \le \epsilon < 550.25$.

If conditions (1) and (3) are met, the product has either 2
or 3 places free from error. Under conditions (2) and (3) or
conditions (1) and (4), the product has 1 or 2 places free from
error. Simultaneous fulfilment of conditions (2) and (4) will be
rare but not impossible. However in this case, the first digit of the
error cannot be larger than 5, hence, as shown later, the number
of digits free from error in the result will usually be the number
in the product minus the number in the error, rather than one less
than that.

For $n > 2$, the constant 0.25 forms a still smaller proportion
of the error. Hence for larger values of $n$, if $m - n = 1$ the product
may be expected to have $n$ or $n - 1$ places free from error.

(IV) Let $m - n > 1$. Here $m \ge 3$ and therefore the terms
$b \cdot 10^{n}$ and 0.25 are negligible in comparison with $a \cdot 10^{m}$ and
may be disregarded, since neither of them can affect the first place
in the error. Then $\epsilon' = 0.5\,a \cdot 10^{m}$ and therefore

$$0.5\,(10^{m-1}) \le \epsilon' < 0.5\,(10^{m}).$$

Either (1) $ab \ge 0.1$ and $AB$ has $m + n$ places to the left of the
decimal point,
or (2) $ab < 0.1$ and $AB$ has $m + n - 1$ places to the left of
the decimal point.

Either (3) $a < .2$ and $\epsilon' < 10^{m-1}$ so that $\epsilon$ has $m - 1$ places to the
left of the decimal point,
or (4) $a \ge .2$ and $10^{m-1} \le \epsilon' < 0.5\,(10^{m})$, so that $\epsilon$ has $m$
places to the left of the decimal point.

Conditions (1) and (3) would leave either $n + 1$ or $n$ places
free from error in the product. Conditions (2) and (3) or
conditions (1) and (4) would leave either $n$ or $n - 1$ places free from
error. If conditions (2) and (4) are met, there would be $n - 1$
places in the product to the left of the first digit in the error, and
since this first digit is not more than 5, the error is not likely to
affect the preceding digit. See section 6.

In general, therefore, if there are $n$ significant figures in the
less accurate of two approximations, the product of the two
approximations will have $n$ or $n - 1$ digits free from error. The product of
two such numbers should be rounded off until it contains only as
many significant figures as the less accurate of the two numbers.
The last digit in the product may then contain some error.
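
The product rule above can be checked numerically. The following is a minimal sketch in Python, not part of the original paper: the function names and the sample values 6247 and 38 are illustrative assumptions. It compares the error computed directly with the closed form $0.5(A + B) + 0.25$ and counts the digits to the left of the decimal point in the product and in the error.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def max_product_error(A, B):
    """Error in the product when both factors are in excess by 0.5."""
    return (A + HALF) * (B + HALF) - A * B

def closed_form(A, B):
    """The same error written as 0.5(A + B) + 0.25."""
    return HALF * (A + B) + Fraction(1, 4)

def digits_left_of_point(x):
    """Count the digits in the integer part of a positive number."""
    whole = int(x)
    return len(str(whole)) if whole > 0 else 0

# Illustrative values (an assumption): m = 4 and n = 2 significant figures.
A, B = 6247, 38
error = max_product_error(A, B)
product_digits = digits_left_of_point(A * B)
error_digits = digits_left_of_point(error)
```

Here the product has 6 digits and the error has 4, so the product has $n = 2$ more digits than the error, which falls under conditions (1) and (4) of case (IV).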

4. Quotients.

The greatest error which can occur in a quotient arises when
the dividend is in excess by 0.5 and the divisor in defect by the
same amount.

Then
$$\epsilon = \frac{A + .5}{B - .5} - \frac{A}{B} = \frac{b + a \cdot 10^{m-n}}{b\,(2b \cdot 10^{n} - 1)}.$$

We must consider separately the cases
(I) $m = n$
(II) $m > n$
(III) $m < n$

(I) Let $m = n$. Then $\epsilon = \dfrac{a + b}{b\,(2b \cdot 10^{n} - 1)}$.

(a) If $m = n = 1$, there are 81 possible quotients of one-place
numbers, and an examination of these shows that in only thirty-one
of these cases the first digit is free from error.

(b) The case for $m = n > 1$ should be studied for specific
values of $n$. In general, however, the error is large as $a \to 1$ and
$b \to 0.1$, and is small as $a \to 0.1$ and $b \to 1.0$. Also as $n$ increases,
the influence of the constant term in the denominator becomes less.


Therefore in general

$$\frac{0.55}{10^{n}} < \epsilon = \frac{a + b}{b\,(2b \cdot 10^{n} - 1)} < \frac{55}{10^{n}}.$$

If $a \to 1.0$, $b \to 0.1$ and $n$ is large, then $\epsilon \to 55/10^{n}$ and the
error has at least $n - 2$ zeros to the left of the first digit. In
this case, $A/B$ has one digit to the left of the decimal point, so that
the quotient will have at least $n - 1$ digits to the left of the first
digit in the error.

If $a \to 0.1$, $b \to 1.0$, and $n$ is large, then $\epsilon \to 0.55/10^{n}$ and the
error has $n$ zeros to the left of the first digit. In this case $A/B$ has
no digits to the left of the decimal, so that the quotient will again
have at least $n - 1$ digits to the left of the first digit in the error.



Furthermore,
$$\epsilon = \frac{a + b}{2b^{2}} \cdot 10^{-n} + \text{higher powers of } 10^{-n},$$
so that $\epsilon' = \dfrac{a + b}{2b^{2} \cdot 10^{n}}$.

Then if $a \ge b$, $10^{-n} < \epsilon' < 55 \cdot 10^{-n}$. We then have 1 digit
to the left of the decimal point in the quotient, and either $n - 1$ or
$n - 2$ zeros preceding the first digit in the error.

If $a < b$, $0.55 \cdot 10^{-n} < \epsilon' < 10 \cdot 10^{-n}$,
since $0.1 \le b < 1.0$.

In this case we have no digits to the left of the decimal point in
the quotient, and either $n$ or $n - 1$ zeros preceding the first digit
in the error.

(II) Let $m > n$.

(a) Let $m - n = 1$. Then

$$\epsilon' = \frac{10a + b}{2b^{2} \cdot 10^{n}},$$

the higher powers of $10^{-n}$ having no effect upon the first
digit in the error.

If $a \ge b$, there are 2 digits to the left of the decimal point
in the quotient and either $n - 2$ or $n - 3$ zeros in the error.

If $a < b$, there is 1 digit to the left of the point in the quotient
and either $n - 1$ or $n - 2$ zeros in the error.

Only in rare cases will there be as few as $n - 3$ zeros in front
of the first digit in the error. To secure this $\epsilon$ must be greater than
$10^{2-n}$. This probability will differ for different values of $n$. For
example, if $n = 4$, we have as bounding conditions,

4X > 20 o,io( 4- 

The ratio of the area bounded by '#* */ , and 

<t=r to the area of the square bounded by 

#=^,7 , and ' ■' '.‘p 



which is the probability that there would be only $n - 3$ zeros
following the decimal point in the error.


(b) Let $m - n > 1$. Then

$$\epsilon = \frac{b + a \cdot 10^{m-n}}{b\,(2b \cdot 10^{n} - 1)}.$$

This situation should be studied for specific values of $n$.
However an approximation may be obtained by letting

$$\epsilon' = \frac{a \cdot 10^{m-2n}}{2b^{2}},$$

since subsequent terms in the expansion do not affect the first digit.

If $a \ge b$, then $\epsilon = \epsilon'$ + other terms which do
not affect the first digit. Also $\epsilon' > 0.5\,(10^{m-2n})$.

In this case the quotient has $m - n + 1$ digits to the left of the
decimal point, while the error has either $m - 2n$, $m - 2n + 1$ or
$m - 2n + 2$. Consequently there are either $n - 1$, $n$, or $n + 1$
digits free from error in the result.

If $a < b$, then $0.05\,(10^{m-2n}) < \epsilon' < 5\,(10^{m-2n})$.

In this case the quotient has $m - n$ digits to the left of the decimal
point, while the error has either $m - 2n - 1$, $m - 2n$, or $m - 2n + 1$.
Again there are either $n - 1$, $n$, or $n + 1$ digits free from error
in the result.


(III) Let $m < n$.

Suppose $n - m = 1$.

If $a \ge b$, the first digit of the quotient is immediately
to the right of the decimal point, while there are from $m$ to $m + 1$
zeros between the point and the first digit in the error.

If $a < b$, there is one zero between the decimal point and the
first digit of the quotient, while in the error there are either $m$ or
$m + 1$ zeros.

In general, where there are $n$ digits in the less reliable
of two approximations, there will be $n$ or $n - 1$ digits
free from error in their quotient. In a few rare cases, a fortuitous
combination of digits, discussed later, may throw the error back
into the $n - 2$ place. In general the quotient should be rounded off
to contain only as many places as there are in the less accurate of
the two numbers.
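
The quotient error can be checked in the same way. This is a minimal sketch in Python under the assumptions of section 2 (integers A and B, each uncertain by at most 0.5); the function names and sample values are illustrative and do not come from the paper. It confirms that the direct difference agrees with the closed form $(A + B)/\bigl(B(2B - 1)\bigr)$, which is the expression above with $A = a \cdot 10^{m}$ and $B = b \cdot 10^{n}$ substituted.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def max_quotient_error(A, B):
    """Error when the dividend is in excess and the divisor in defect by 0.5."""
    return (A + HALF) / (B - HALF) - Fraction(A, B)

def closed_form(A, B):
    """The same error written as (A + B) / (B (2B - 1))."""
    return Fraction(A + B, B * (2 * B - 1))

# Illustrative values (an assumption): m = 4 and n = 2 significant figures.
A, B = 6247, 38
error = max_quotient_error(A, B)
```

Exact rational arithmetic is used so that the comparison is not blurred by floating-point rounding.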


5. Square Root.

When a number is in excess by 0.5, the error in its square
root is
$$(A + \tfrac{1}{2})^{1/2} - A^{1/2} = \tfrac{1}{4}A^{-1/2} - \tfrac{1}{32}A^{-3/2} + \tfrac{1}{128}A^{-5/2} - \cdots + (-1)^{k-1}\,\frac{1 \cdot 3 \cdot 5 \cdots (2k-3)}{2^{2k}\,k!}\,A^{-(2k-1)/2} + \cdots$$

When a number is in defect by 0.5, the error in its square
root is
$$(A - \tfrac{1}{2})^{1/2} - A^{1/2} = -\tfrac{1}{4}A^{-1/2} - \tfrac{1}{32}A^{-3/2} - \tfrac{1}{128}A^{-5/2} - \cdots - \frac{1 \cdot 3 \cdot 5 \cdots (2k-3)}{2^{2k}\,k!}\,A^{-(2k-1)/2} - \cdots$$

Obviously the greater error occurs when the number is in defect
by 0.5, but in either case we may neglect all terms after the
first. Each term can readily be shown to be larger than the term
following it, and the ratio of the first term to the second is so large
that the second term cannot affect the first digit in the error.

We must consider in turn the case in which $m$ is even and
that in which $m$ is odd.

(I) Let $m = 2h$.

Then $A = a\,(10^{2h})$ and has $2h$ digits to the left of the decimal
point, and $A^{1/2} = a^{1/2}(10^{h})$ and has $h$ digits to the left of the decimal.

Now $|\epsilon'| = \dfrac{1}{4A^{1/2}} = \dfrac{0.25}{a^{1/2}(10^{h})}$. But $0.316 < a^{1/2} < 1$.

Since $0.25\,(10^{-h}) < |\epsilon'| < 0.791\,(10^{-h})$ the error has $h$ zeros
between its first digit and the decimal point.

Therefore when $m$ is even, the root contains as many significant
figures as the number.

(II) Let $m = 2h - 1$.

Then $A = a\,(10^{2h-1}) = 10a\,(10^{2h-2})$.

Then $A^{1/2} = (10a)^{1/2}(10^{h-1})$ and has $h$
digits to the left of the decimal.

$$|\epsilon'| = \frac{1}{4A^{1/2}} = \frac{0.25}{(10a)^{1/2}(10^{h-1})}$$

$$0.079 < \frac{0.25}{(10a)^{1/2}} \le 0.25$$

$$(.079)\,10^{-(h-1)} < |\epsilon'| \le 0.25\,(10^{-(h-1)})$$

The error then will affect the $h$-th place to the right of the decimal
point, and the number of digits free from error will be $h + h - 1 = 2h - 1$
which was the number of places in the square.

(III) There is also the case where the decimal point is so placed
that the second digit in the last period is not known, as in |/3^
or y' 0.46825. 

Here \€'l = A .In this case also the number of 

digits free from error in the root is the number of digits in the 
original number. 

In general, then, the number of digits free from error in the
square root of a number is the number of digits in the number.
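
The claim that only the first term of the expansion matters can be illustrated numerically. The sketch below is in Python with floating-point arithmetic; the function names and the sample value 6247 are assumptions made for illustration. It compares the exact errors with the leading term $\tfrac{1}{4}A^{-1/2}$.

```python
import math

def sqrt_error_excess(A):
    """Error in the root when the number is in excess by 0.5."""
    return math.sqrt(A + 0.5) - math.sqrt(A)

def sqrt_error_defect(A):
    """Error in the root when the number is in defect by 0.5."""
    return math.sqrt(A - 0.5) - math.sqrt(A)

def leading_term(A):
    """First term of the expansion, (1/4) A^(-1/2)."""
    return 0.25 / math.sqrt(A)

A = 6247  # illustrative four-figure number, so m = 4 is even
excess = sqrt_error_excess(A)
defect = sqrt_error_defect(A)
first = leading_term(A)
```

For this value the leading term differs from either exact error by less than one part in ten thousand, and the error in defect is slightly larger in magnitude than the error in excess, as the text states.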

6. Effect of the Error.

The following table will illustrate how an error of $n$ places may
affect either $n$ or $n + 1$ places in the computation:

                                  ERROR IN EXCESS        ERROR IN DEFECT
Result obtained by computation    6247   5986   7253     6247   5986   7253
Error                               33     53     12       33     53     12
True value                        6214   5933   7241     6280   6039   7265
Computed value, rounded           6200   6000   7300     6200   6000   7300
True value, rounded               6200   5900   7200     6300   6000   7300
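
The rows of this table follow from simple arithmetic, as the following Python sketch shows. It is an illustration only: the helper names are assumptions, and the rounding helper is adequate here because none of the sample values ends in exactly 50.

```python
def round_to_hundreds(x):
    """Round a positive integer to the nearest hundred (no ties occur here)."""
    return (x + 50) // 100 * 100

COMPUTED = [6247, 5986, 7253]
ERRORS = [33, 53, 12]

def table_rows(sign):
    """sign = +1 when the computed value is in excess, -1 when in defect.

    Returns (computed, true, computed rounded, true rounded) for each column.
    """
    rows = []
    for computed, error in zip(COMPUTED, ERRORS):
        true_value = computed - sign * error
        rows.append((computed, true_value,
                     round_to_hundreds(computed),
                     round_to_hundreds(true_value)))
    return rows

excess_rows = table_rows(+1)
defect_rows = table_rows(-1)
```

In three of the six columns the rounded computed value and the rounded true value differ in the hundreds place, although the error itself has only two digits.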


We will now show that the chances are approximately 3 out of 4
that an error of $n$ digits affects $n$ and not $n + 1$ places in the result.
For convenience we may place the decimal point to the left of the
first digit in the error, the position of the decimal point being
entirely independent of the number of significant figures in the
computation.

Let $\epsilon$ = error.

$d$ = portion of the number to the right of the decimal point.
$c$ = portion of the number to the left of the decimal point.
$A$ = the true value of the number.

Then $c + d$ = result of computation.

$A = c + d - \epsilon$ = true value.

We will consider $\epsilon$ to be positive when the observed value is in
excess and negative when it is in defect.

We will consider separately the case where the computed value is
in excess and the case where it is in defect.

Suppose the result of computation to be in excess.

1. (a) Then if $d > .5$ and $\epsilon > d - .5$
   (b) or $d < .5$ and $\epsilon > d + .5$
   the error will affect $n + 1$ places in the result.

2. (a) If $d > .5$ and $\epsilon < d - .5$
   (b) or $d < .5$ and $\epsilon < d + .5$
   the error will affect only $n$ places in the result.

3. (a) If $d > .5$ and $\epsilon = d - .5$
   (b) or $d < .5$ and $\epsilon = d + .5$
   (c) or $d = .5$ and …
   the error will affect either $n$ or $n + 1$ places depending on
   whether the last digit of $c$ is odd or even. This is on the
   assumption of the usual rule, that in rounding off the digit 5
   the previous digit is made even.


Since the number of digits in $\epsilon$ is finite, the values of $d$ and
of $\epsilon$ form discrete series, so that we shall have to think of $d = .5$
not as an infinitesimal but as a finite portion of the scale, ranging
from $d = .495$ to $d = .505$ when $n = 2$, from $d = .4995$ to $d = .5005$
when $n = 3$, etc. If we map the region bounded by $d = 0$, $d = 1$; $\epsilon = 0$,
$\epsilon = 1$, the proportions of this area representing conditions (1), (2),
and (3) represent the respective probabilities of these three sets of
conditions. As $n$ increases, the width of the strip $d = .5$ becomes
smaller, the probability of (3) becomes smaller, and the probability
of (2) approaches 3/4.

When $n = 2$ these areas are respectively

1 (a) and 1 (b)               .245025
2 (a) and 2 (b)               .735075
3 (a), 3 (b), and 3 (c)       .0199
                             --------
                             1.0000000

If we may assume that the last digit in $c$ is as likely to be even
as to be odd, we may say that the probability that the error will
affect $n + 1$ places in the result is slightly more than 1/4 when
there are two digits in $\epsilon$. This ratio will approach 1/4 as the
number of digits in $\epsilon$ increases.

A similar argument holds when the result of computation is
in defect.
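
The arithmetic of these three areas can be verified exactly. The Python sketch below takes the three figures quoted in the text as given and only checks their sum and the resulting probability; the variable names are illustrative assumptions.

```python
from fractions import Fraction

# Areas quoted in the text for n = 2.
area_1 = Fraction(245025, 10**6)   # conditions 1 (a) and 1 (b)
area_2 = Fraction(735075, 10**6)   # conditions 2 (a) and 2 (b)
area_3 = Fraction(199, 10**4)      # conditions 3 (a), 3 (b) and 3 (c)

total = area_1 + area_2 + area_3

# If the last digit of c is as likely to be even as odd, half of the
# cases under condition 3 affect n + 1 places.
prob_extra_place = area_1 + area_3 / 2
```

The total is exactly 1, the first area equals $(0.495)^2$, the second is three times the first, and the probability that the error affects $n + 1$ places comes to 0.254975, slightly more than 1/4.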

7. Summary of Rules.

On the assumption that an error of $n$ places affects only $n$
places in the result we have the following rules:

If the less accurate of two approximate numbers contains $n$
significant digits, their product and their quotient each contain
$n$ or $n - 1$ significant digits.

The square root of a number contains as many significant
figures as the number.

About once in four times, the error will affect one more place
than these rules state, for the reasons given in section 6.



COMBINING TWO PROBABILITY FUNCTIONS 

William Dowell Baten, 

University of Michigan.

The object of this paper is to show results which arise from 
combining two probability functions in finding the probability 
function for the sum of two independent variables. The first part 
presents the sum function when the probability law for each individual
variable is "one-half" of the Pearson Type X law. From
this law arise certain ideas concerning the Beta function which are 
not presented by texts treating this subject. 

The second part presents some peculiar probability functions 
when special laws for the individual variables are considered. Here 
certain laws with infinite discontinuities are combined. 

I. The probability function for the sum of $n$ variables when
each is subject to the function $e^{-x}$.

Let the probability that the chance variable $X_1$ lies in the
interval $(x_1, x_1 + dx_1)$ be to within infinitesimals of higher order
$f(x_1)\,dx_1$ and the probability that the chance variable $X_2$ lies in
the interval $(x_2, x_2 + dx_2)$ be to within infinitesimals of higher order
$g(x_2)\,dx_2$, where $x_1$ and $x_2$ may have respectively any real value.

By a well known theorem, the probability that the sum,
$z = x_1 + x_2$, lies in the interval $(z, z + dz)$ is, to within
infinitesimals of higher order,

$$F(z)\,dz = dz \int_{-\infty}^{\infty} f(x)\,g(z - x)\,dx.$$

Let

$$f(x_1) = e^{-x_1}, \text{ for } (0, \infty); \qquad = 0, \text{ elsewhere},$$

and

$$g(x_2) = e^{-x_2}, \text{ for } (0, \infty); \qquad = 0, \text{ elsewhere}.$$

According to the above theorem, the probability function for the
sum, $z = x_1 + x_2$, is

$$F(z) = \int_{0}^{z} e^{-x}\,e^{-(z - x)}\,dx = z\,e^{-z}, \text{ for } (0, \infty); \qquad = 0, \text{ elsewhere},$$

which is a Pearson Type III function. The probability functions
or laws for $X_1$ and $X_2$ are discontinuous at the origin, while the law
for the sum is continuous from minus infinity to plus infinity.

By using $F(z)$, the frequency function for the
sum, $z = x_1 + x_2 + x_3$, is

$$\int_{0}^{z} x\,e^{-x}\,e^{-(z - x)}\,dx = \frac{z^{2} e^{-z}}{2!}, \text{ for } (0, \infty); \qquad = 0, \text{ elsewhere}.$$

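In general, if the probability function for the individual
variable is

$$f(x) = e^{-x}, \text{ for } (0, \infty); \qquad = 0, \text{ elsewhere},$$

then the probability function for the sum, $z = \sum_{i=1}^{n} x_i$,
is

$$F(z) = \frac{z^{n-1} e^{-z}}{(n-1)!}, \text{ for } (0, \infty); \qquad = 0, \text{ elsewhere}.$$

This is also a Type III law. Others have studied this law and have
obtained functions for the sum and the average.¹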
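
The combining theorem can be illustrated numerically for laws that vanish off the positive axis. The following Python sketch is an illustration, not the author's procedure: the midpoint rule, the step count, and all function names are assumptions. It combines $e^{-x}$ with itself, and then combines the result with $e^{-x}$ once more.

```python
import math

def combine(f, g, z, steps=2000):
    """Midpoint-rule estimate of the integral of f(x) g(z - x) for 0 < x < z.

    Valid only when f and g are zero off the positive axis (an assumption).
    """
    if z <= 0:
        return 0.0
    h = z / steps
    return h * sum(f((i + 0.5) * h) * g(z - (i + 0.5) * h) for i in range(steps))

def exp_law(x):
    """The individual law e^(-x) on the positive axis, zero elsewhere."""
    return math.exp(-x) if x > 0 else 0.0

def two_law(x):
    """The law x e^(-x) found above for the sum of two variables."""
    return x * math.exp(-x) if x > 0 else 0.0

def sum_of_two(z):
    return combine(exp_law, exp_law, z)

def sum_of_three(z):
    return combine(two_law, exp_law, z)
```

At $z = 1.5$, for instance, the two estimates agree with $z e^{-z}$ and $z^{2} e^{-z}/2!$ to many decimal places, because the integrands are constant and linear in $x$ respectively.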

¹ Mayr, Wahrscheinlichkeitsfunktionen und ihre Anwendungen, Monatshefte
für Math. und Phys., Vol. 30, 1920, p. 20.

Church, On the mean and squared standard deviation of small samples
from any population, Biometrika, Vol. 18, 1926, pp. 321-394.

Irwin, On the frequency distribution of the means of samples from a
population having any law of frequency with finite moments, with special
reference to Pearson Type II, Biometrika, Vol. 19, 1927, pp. 225-239.

C. C. Craig, Sampling when the parent population is a Pearson Type
III, Biometrika, Vol. 21, 1929, pp. 287-293.

A. T. Craig, On the distribution of certain statistics, Amer. Jour. of
Math., Vol. 54, No. 2, 1932, pp. 353-366.

Baten, Frequency laws for the sum of n va…


f;(^) 





The purpose of developing this law for the sum of $n$ independent
variables is to show how certain finite summations are evaluated.
An interesting summation arises when $f$ and $g$ are interchanged
in certain cases. For example the law for the sum,

IS , and the law for the sum. 



- O ^ elsewhere. 


Since the probability function for the sum of the first ni- / varia- 
bles, when each is subject to^ , is 




'Tl. / 

^ e A/ 


, for the positive axis 


then (a) and (b) are equal and the summation in the above ex- 
pression for (a) is equal to ; hence 

If the probability function for the sum of the first $2n$ variables
is obtained by "combining" the probability function for the
sum of the first $n$ variables with the probability function for the
sum of the following $n$ variables, another interesting summation
arises. This summation is a Beta function in disguise. For example
the probability function for the sum, $u = x_1 + x_2 + x_3$, is
$u^{2} e^{-u}/2!$ for positive $u$ and zero elsewhere, and the probability
function for the sum, $v = x_4 + x_5 + x_6$, is $v^{2} e^{-v}/2!$ for
positive $v$ and zero elsewhere. The probability function for the
sum, $z = u + v$, is

$$\frac{z^{5} e^{-z}}{2!\,2!}\left(\frac{1}{3} - \frac{2}{4} + \frac{1}{5}\right), \text{ for the positive axis}$$

and zero elsewhere. The quantity in parentheses has for numerators
the coefficients of the binomial $(1 - x)^{2}$, while the denominators
begin with a number greater by one than the exponent of the
binomial and increase by unity from term to term. The above form
suggests the following integral

$$\int_{0}^{1} x^{2}(1 - x)^{2}\,dx = B(3, 3).$$

In general the probability function for the sum of the first $2n$
variables, by using the probability function for the sum of the first
$n$ and the probability function for the sum of the following $n$, is

$$\frac{z^{2n-1} e^{-z}}{[(n-1)!]^{2}} \sum_{k=0}^{n-1} \frac{(-1)^{k}\binom{n-1}{k}}{n + k}, \text{ for } (0, \infty) \text{ and zero elsewhere.}$$

The summation can be written as a definite integral

If the probability law for the sum of $n$ independent variables
is obtained by combining the probability law for the sum of the
first $s$ variables with the law for the sum of the following $n - s$
variables the following summation arises which is also equal to a
Beta function. This summation is

$$\sum_{k=0}^{n-s-1} \frac{(-1)^{k}\binom{n-s-1}{k}}{s + k} = \int_{0}^{1} x^{s-1}(1 - x)^{n-s-1}\,dx = B(s,\, n - s).$$

This idea concerning the Beta function appears to be new. 
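
The link between the alternating summation and the Beta function can be checked exactly for small integers. The Python sketch below is an illustration only: it assumes the summation has the form obtained by expanding $(1 - x)^{n-s-1}$ by the binomial theorem and integrating term by term, and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def alternating_sum(s, n):
    """Sum over k of (-1)^k C(n-s-1, k) / (s + k), for integers 0 < s < n."""
    return sum(Fraction((-1) ** k * comb(n - s - 1, k), s + k)
               for k in range(n - s))

def beta(p, q):
    """B(p, q) for positive integers, written with factorials."""
    return Fraction(factorial(p - 1) * factorial(q - 1), factorial(p + q - 1))

# The case s = 3, n = 6 gives 1/3 - 2/4 + 1/5.
example = alternating_sum(3, 6)
```

For $s = 3$ and $n = 6$ both sides equal 1/30, and the agreement holds for every pair tried in the accompanying check.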

II. Combining two probability functions. 

Combining here shall mean finding the probability function 
for the sum of the variables from the probability functions of the 
individual variables. Many peculiar functions arise, when various 
laws are used for the probability functions of the individual varia- 
bles. This section presents a few of them. 

.•'Bet , ' 

/{it / (0,1) and zero elsewhere, 
and . ' • ' 

' ■ '• f {^■"•2|f|'''|3'.-:,b^.-(®’3)\'and;zbro';elsfewhere.' ' ■ 

These laws are drawn below. Both have two points of discontinuity.







while is a similar curve turned in the opposite direction and 

has an infinite slope at (1,0). 



The law for the sum, x-fj/ = s- is 
f y for (0,1) 

- I )/V2'~0J 7 for (1,2) 

r Q ^ elsewhere. 

$F(z)$ is somewhat of a surprise for it is equal to zero at the origin
and the point (2,0) and approaches infinity from the right and from
the left at the point (1,0). The slope of the law for the sum is
infinite at the origin and at the points (2,0) and (1,0). $F(z)$
appears below.






3. Let , for (0,1) and zero elsewhere and / 

for the interval (0,1) and zero elsewhere; then the probability law 
of Wr is 'A.(w] ■= -^ 7 =^ , for the interval (0,1) and zero 

elsewhere* Let , then the probability ■ function for «. is 

interval (0,1) and zero elsewhere* 
According to the theorem used in part I the probability law for 


the sum, 

F(^) 



2- ,is 

j for the interval (0,1) 

arc cos ^ , for (1,2) 

0, elsewhere. The plot of $F(z)$ is below.



The functions "^(w) and are shaped functions with 
infinite slope at the origin and are equal to f(x) in example 2. The 
law for the sum of the squares in this case has one point of discon- 
tinuity which is at the origin. The function for the sum is constant 
throughout the interval (0,1) and is equal to an inverse cosine 
function throughout the interval (1,2). 

4. If = for the interval (0,1) and zero else- 
where and - for the interval (0,1) and zero else- 

where, then the law for the sum, X-h ^ is 

^.6 15 ^ > interval (0,1) 

Foi) =i .i>(- sA-h^a^SO’^^-hloo^-iSifU) for (1.2) 

■ Cy elsewhere. ■ 

The function $F(z)$ has three modes and has its highest point
where one would least expect it, and has large slopes at the origin,
and at the points (1,0), (2,0). To appreciate the nature of $F(z)$
here the graphs of the functions for $x$ and $y$ should be examined.
They are U-shaped curves which are tangent to the horizontal axis
at the middle of the interval (0,1). See the second figure in 1.
$F(z)$ is shown in the following figure.






ON THE SYSTEMATIC FITTING OF STRAIGHT 
LINE TRENDS BY STENCIL AND CALCULATING 

MACHINE 

By 

Herbert A. Toops, 

Ohio State University 

Whenever there is only one plotting point corresponding to $N$
successive abscissal values equally spaced, it is possible greatly to
simplify the fitting of straight lines to the empirical observations.
Let the $N$ several abscissal values (ordinarily time) be
$X_1, X_2, \ldots, X_N$.
Let these several $X$ values be replaced by a series of transmuted
steps, $x' = 1, 2, 3, \ldots, N$.

Let the several corresponding ordinates be $y_1, y_2, \ldots, y_N$.

The situation is represented in Figure 1.

Fig. 1. Illustrating the Notation Employed 



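Letting the equation of the fitted straight line be

(1) $y = a + b\,x'$

it is well known that the solutions, by least squares, for the two
constants are,

(2) $a = \dfrac{\sum x'^{2} \sum y - \sum x' \sum x'y}{N \sum x'^{2} - (\sum x')^{2}}$

and

(3) $b = \dfrac{N \sum x'y - \sum x' \sum y}{N \sum x'^{2} - (\sum x')^{2}}$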


Also inasmuch as the $x'$ coordinates are an arithmetical series,
we may substitute in the above for $\sum x'$ and $\sum x'^{2}$ as follows:

(4) $\sum x' = \dfrac{N(N+1)}{2}$,

(5) $\sum x'^{2} = \dfrac{N(N+1)(2N+1)}{6}$,

thus yielding,

(6) $a = \dfrac{(4N+2)\sum y - 6\sum x'y}{N(N-1)}$,

(7) $b = \dfrac{12\sum x'y - 6(N+1)\sum y}{N(N^{2}-1)}$,

which equations, if of infrequent usage, are highly serviceable. It 
is possible, however, to proceed to the derivation of formulae still 
more useful for systematic fitting of straight line trends. Thus, 
there being only one ordinate to each abscissal value as assumed, 

(8) $\sum y = y_1 + y_2 + \cdots + y_N$

(9) $\sum x'y = y_1 + 2y_2 + \cdots + N y_N$

It will be observed further that the denominator $N(N-1)$ of (6) is
invariably an even number, and therefore exactly divisible by 2.¹
Substituting (8) and (9) in (6), and then multiplying both numerator
and denominator by ½, we obtain

(10) $a = \dfrac{\sum_{i=1}^{N} (2N + 1 - 3i)\,y_i}{N(N-1)/2}$

an equation which is a function only of the several ordinates and
of $N$. Furthermore, this equation when solved for specific values
of $N$ leads to a system of equations remarkably simple; and
moreover one easily extended indefinitely.

Thus when $N = 2$,

(11) $a_2 = \tfrac{1}{1}\,(2y_1 - y_2)$

¹ The desiderata are: to obtain a formula which shall contain as
small multipliers of the several $y$'s as possible, consistent with

yd',;;' V.2.'/ 'Integral ' 

3, A« integral numerator. 



(12) When $N = 3$, $a_3 = \tfrac{1}{3}\,(4y_1 + y_2 - 2y_3)$

(13) When $N = 4$, $a_4 = \tfrac{1}{6}\,(6y_1 + 3y_2 + 0y_3 - 3y_4)$

etc., etc.

The symmetry of arrangement is more readily grasped if the several
coefficients of the $y$'s, and the denominators, be collected
into an orderly table thus (Figure 2):

Fig. 2. Systematic Solution of Equation (10), for Specific Values of $N$,
for Finding $a$.

$y = a + b\,x'$

Rule: Extend and cumulate the successive $y$'s by the stencil multipliers
of the row of the table appropriate to the problem (determined by
$N$) in question. Divide the accumulated sum by the denominator
of the same row. The resulting quotient is $a$.



determined quickly by 


1. Simply choosing the appropriate row of multipliers for
the number, $N$, of successive plotting points available;² and

2. Extending the several $y$'s by the appropriate multipliers,
most conveniently done by calculating machine;

3. Dividing the sum so obtained by the appropriate divisor,

² Obviously if any plotting point intermediate between $y_1$ and $y_N$ is
missing it must be supplied (by interpolation) …



















The multipliers may be extended indefinitely for larger and
larger values of $N$ by simply noting that the diagonal marginal row
increases by the successive addition of -1, while columns increase
by the successive addition of 2; and rows decrease by the successive
addition of -3. The denominators have a constant second order
difference, $\Delta^{2} = 1$, and consequently may be prolonged readily.

Let us now return to $b$, equation (7). The denominator
$N(N^2-1)$ is always divisible by 6. Hence, substituting (8) and
(9) in (7) and dividing both numerator and denominator by 6,
we obtain,

(14)    $b = \frac{\sum_{i=1}^{N} (2i - N - 1)\, y_i}{N(N^2-1)/6}$

In like manner, this equation when solved for specific values
of $N$ leads to a systematic series of equations:

(15)    $b = \frac{1}{1}(-y_1 + y_2)$    (where $N = 2$)

(16)    $b = \frac{1}{4}(-2y_1 + 0y_2 + 2y_3)$    (where $N = 3$)

(17)    $b = \frac{1}{10}(-3y_1 - y_2 + y_3 + 3y_4)$    (where $N = 4$)

The corresponding table yields Figure 3.

Fig. 3. Systematic Solution of Equation (14), for Specific Values of $N$,
for Finding $b$ in $y = a + bx'$.

Rule: Extend and cumulate the successive $y$'s by the stencil multipliers
of the row of the table appropriate to the problem (determined by
$N$) in question. Divide the accumulated sum by the denominator,
$D$, of the same row. The resulting quotient is $b$.




















The extension of this table is readily made by observing that
the diagonal marginal row increases by adding 1; the columns, by
adding -1, and the rows, by adding 2; while the denominator has
a constant third order difference, $\Delta^3 = 1$.
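
The rows of Figures 2 and 3 can be regenerated mechanically. The following Python lines are only a sketch and not part of the original article: the names `a_row`, `b_row` and `fit` are hypothetical, and the closed forms for the multipliers assume, as equations (10) and (14) do, that the abscissae are the integers 1, 2, ..., N.

```python
from fractions import Fraction


def a_row(n):
    """Row N of the table for a: multipliers (2N + 1 - 3i), denominator N(N-1)/2."""
    return [2 * n + 1 - 3 * i for i in range(1, n + 1)], n * (n - 1) // 2


def b_row(n):
    """Row N of the table for b: multipliers (2i - N - 1), denominator N(N^2-1)/6."""
    return [2 * i - n - 1 for i in range(1, n + 1)], n * (n * n - 1) // 6


def fit(ys):
    """Least-squares a and b for ordinates ys at abscissae 1, 2, ..., N (integer ys)."""
    n = len(ys)
    a_mult, a_den = a_row(n)
    b_mult, b_den = b_row(n)
    a = Fraction(sum(m * y for m, y in zip(a_mult, ys)), a_den)
    b = Fraction(sum(m * y for m, y in zip(b_mult, ys)), b_den)
    return a, b


if __name__ == "__main__":
    for n in range(2, 7):
        print(n, a_row(n), b_row(n))
```

Each successive row printed follows the extension rules stated in the text: the last multiplier of the row for a falls by 1, each column rises by 2, and the denominators 1, 3, 6, 10, ... have second difference 1.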

For hand computations these two tables, Figures 2 and 3, are
undoubtedly simplest because the multipliers are smallest. If, how-
ever, a calculating machine is available, the magnitude of the mul-
tipliers is of relatively small moment if anything is to be gained
by using different multipliers. It is obvious, for example, that the
several multipliers of a row may be divided by the appropriate
denominator, the resulting decimal multipliers, to replace the pres-
ent integral multipliers, being presented in tables of $N$ columns or
sections.

An even more useful set of tables for general purposes may
be derived by reducing equations (10) and (14) to a common
(integral) denominator, so that the same denominator may be
employed for calculating both $a$ and $b$.

We may obtain the least common denominator, $\frac{N(N^2-1)}{2}$, by
multiplying equation (10) by $\frac{N+1}{N+1}$; and, equation (14), by mul-
tiplying by $\frac{3}{3}$, thus:

(18)    $a = \frac{(N+1)\sum_{i=1}^{N} (2N + 1 - 3i)\, y_i}{N(N^2-1)/2}$

(19)    $b = \frac{3\sum_{i=1}^{N} (2i - N - 1)\, y_i}{N(N^2-1)/2}$

Accordingly, it follows that if the three following changes be ef-
fected, we shall have an integral system:

1. The previous table values of $a$ to be multiplied by $(N+1)$
of the row in question, throughout.

2. The previous table values of $b$ to be multiplied by 3
throughout;

3. The common denominator, $D$, of any row in ques-
tion to be made to be 3 times the previous denominator of $b$ of
the row.

The two sets of multipliers may now be combined into one
systematic stencil (Figure 4) with a common denominator, $D$, or
common reciprocal, $\frac{1}{D}$. The directions for using this stencil are
as follows:

1. Count the number of plotting points.

2. Find the row of the stencil having the same number of
plotting points, ($N$).

3. Record the $y$ values for the successive plotting points
in the little rectangles of the row just located.

4. Using a calculating machine, obtain the summation of the
extensions of the several $y$ values by the multipliers just imme-
diately above, employing a fixed decimal point.

5. Divide the sum just found by the divisor, $D$, at the left
hand of the row. The result is $b$.

6. Similarly obtain the summation of the extensions of the
$y$'s by the multipliers of the several respective windows imme-
diately beneath, again employing a fixed decimal point.

7. Divide the sum thus obtained by the same divisor, $D$. The
result is $a$.

8. Substitute values of $a$ and $b$ in

(1)    $y = a + bx'$.

9. If we summate (1) we obtain the checking equation,

(20)    $2Na + N(N+1)\,b = 2\sum y$,

since $\sum x' = \frac{N(N+1)}{2}$.

Now, let us employ the revised stencil on a problem (of perfect
fit):

     X            Y          x'
   (Age)    (Attainment)

    3.5         12.72        1
    5.5         22.45        2
    7.5         32.18        3
    9.5         41.91        4
   11.5         51.64        5
   13.5         61.37        6
   15.5         71.10        7 = N
   ----        ------
   66.5        293.37 = $\sum Y$



Fig. 4. Revised Stencil for Solving Formulae (19) and (18) for $b$ and $a$
respectively, $y = a + bx'$.

(19)    $b = \frac{3\sum (2i - N - 1)\, y_i}{D}$        (18)    $a = \frac{(N+1)\sum (2N + 1 - 3i)\, y_i}{D}$        $D = \frac{N(N^2-1)}{2}$

In each row the upper line holds the multipliers for $b$ and the lower
line the multipliers for $a$; the $y$ values are entered in the windows
between the two lines.

                  Multiplier of Ordinate No.:
  N     D         1     2     3     4     5     6     7     8     9    10    11    12

  2     3   b    -3     3
            a     6    -3

  3    12   b    -6     0     6
            a    16     4    -8

  4    30   b    -9    -3     3     9
            a    30    15     0   -15

  5    60   b   -12    -6     0     6    12
            a    48    30    12    -6   -24

  6   105   b   -15    -9    -3     3     9    15
            a    70    49    28     7   -14   -35

  7   168   b   -18   -12    -6     0     6    12    18
            a    96    72    48    24     0   -24   -48

  8   252   b   -21   -15    -9    -3     3     9    15    21
            a   126    99    72    45    18    -9   -36   -63

  9   360   b   -24   -18   -12    -6     0     6    12    18    24
            a   160   130   100    70    40    10   -20   -50   -80

 10   495   b   -27   -21   -15    -9    -3     3     9    15    21    27
            a   198   165   132    99    66    33     0   -33   -66   -99

 11   660   b   -30   -24   -18   -12    -6     0     6    12    18    24    30
            a   240   204   168   132    96    60    24   -12   -48   -84  -120

 12   858   b   -33   -27   -21   -15    -9    -3     3     9    15    21    27    33
            a   286   247   208   169   130    91    52    13   -26   -65  -104  -143

Since the $X$-coordinates are replaced, for computation, by the
series $x' = 1, 2, \dots, N$, the following transmuting equation prevails:

(21)    $x' = .5X - .75$.

The stencil set-up, employed for seven $y$'s, is:

                -18     -12      -6       0       6      12      18
  7   168     12.72   22.45   32.18   41.91   51.64   61.37   71.10
                 96      72      48      24       0     -24     -48

$b = \frac{1}{168}\,[-18(12.72) - 12(22.45) - \dots + 18(71.10)] = 9.730$

$a = \frac{1}{168}\,[96(12.72) + 72(22.45) + \dots - 48(71.10)] = 2.99$

At this stage, application of formula (20) proves the correctness
of this equation.

Now substitute for $x'$ its equivalent $(.5X - .75)$ and

$Y = 4.865X - 4.3075$,

which may be checked by summating,

$4.865\sum X - 4.3075N = 4.865(66.5) - 4.3075(7) = \sum Y$;

$293.37 = 293.37$. The check holds.
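
The worked problem can be retraced in a few lines of Python. This is a sketch under stated assumptions, not the author's procedure verbatim: the helper names `stencil_row` and `stencil_fit` are hypothetical, and the multipliers are generated from the closed forms of (18) and (19) instead of being read from the printed stencil.

```python
def stencil_row(n):
    """Common divisor D and the b- and a-multipliers of stencil row N."""
    d = n * (n * n - 1) // 2
    b_mult = [3 * (2 * i - n - 1) for i in range(1, n + 1)]
    a_mult = [(n + 1) * (2 * n + 1 - 3 * i) for i in range(1, n + 1)]
    return d, b_mult, a_mult


def stencil_fit(ys):
    """Return (a, b) of y = a + b x' for ordinates at x' = 1, 2, ..., N."""
    d, b_mult, a_mult = stencil_row(len(ys))
    b = sum(m * y for m, y in zip(b_mult, ys)) / d
    a = sum(m * y for m, y in zip(a_mult, ys)) / d
    return a, b


ys = [12.72, 22.45, 32.18, 41.91, 51.64, 61.37, 71.10]
n = len(ys)
a, b = stencil_fit(ys)

# Checking equation (20): 2Na + N(N+1)b should equal twice the sum of the y's.
check_left = 2 * n * a + n * (n + 1) * b
check_right = 2 * sum(ys)

# Back to the original scale through (21), x' = .5X - .75.
slope_X = 0.5 * b
intercept_X = a - 0.75 * b

if __name__ == "__main__":
    print(a, b, check_left, check_right, intercept_X, slope_X)
```

Because the data are a perfect fit, the two sides of the checking equation agree up to floating-point rounding.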



STATISTICAL ANALYSIS OF ONE-DIMENSIONAL 
DISTRIBUTIONS 


By 

Robert Schmidt 


The present research is to be considered as a contribution to
a range of science in which the pioneer work has been done by K.
Pearson. The method for analysing statistical distributions to be
developed here differs in principle (as far as the author can see)
from the known ones. The mathematical resources are all well
known and so simple that their deduction ab ovo could be carried
through on a few pages; hence this investigation is intelligible to
anyone who remembers his mathematical knowledge acquired at
school.

The main resource consists of the process of orthogonaliza-
tion, fundamental in the theory of integral equations. The central
idea characterizing the following is, not to deal with a frequency
function itself, nor with its integral function, but with the inverse
of the integral function. The general scope will be given in No. 3.

The author is indebted to his wife and to Mr. J. L. K. Gifford,
M.A., of Queensland University for kind help in revising the
English text.

1. Designations and General Assumptions

A curve $y = \varphi(x)$ shall be called a "frequency
curve", the function $\varphi(x)$ a "frequency function", if $\varphi(x)$ satisfies
the following conditions:

1. $\varphi(x) \ge 0$ $(-\infty < x < +\infty)$.

2. The moments $\mu_k = \int_{-\infty}^{+\infty} x^k \varphi(x)\, dx$ exist for $k = 0, 1, 2, \dots$.*

3. $\mu_0 = 1$.

* In this paper we shall not have to make use of the second condition
(except in the special case $k = 0$). In further notes, too, the condition will
never be applied to its full extent.

For our purposes it is convenient (though not necessary) to
add a fourth condition which it is simplest to formulate by using
the function

$\Phi(x) = \int_{-\infty}^{x} \varphi(t)\, dt$.

This function is constantly increasing in $-\infty < x < +\infty$, and
we have

$\lim_{x \to -\infty} \Phi(x) = 0$,    $\lim_{x \to +\infty} \Phi(x) = 1$.

The fourth condition is to guarantee that $\Phi(x)$ assumes every value
from $0 < \xi < 1$ just once, so that $\Phi(x)$ possesses a unique inverse
function in the ordinary sense.

4. a) $\Phi(x)$ is continuous.

b) At every $x$ where $0 < \Phi(x) < 1$, $\Phi(x)$ is in-
creasing (strictly speaking), that is: From $x_1 < x_2$ it always
follows that $\Phi(x_1) < \Phi(x_2)$.

When the conditions 1-4 are fulfilled, let us denote $\Phi(x)$ as
the "frequency integral" of the frequency function $\varphi(x)$.
Then there exists one and only one function $f(\xi)$ in $0 < \xi < 1$,
satisfying $\Phi(f(\xi)) = \xi$,
and $f(\xi)$ is called the inverse function of $\Phi(x)$. This function
is continuous and constantly increasing (strictly speaking), and
therefore possesses a unique inverse, namely: $\Phi(x)$.






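We give here some special examples of frequency curves.

I. The "Step Curve".

$\varphi(x) = \begin{cases} 1 & \text{in } 0 < x < 1 \\ 0 & \text{otherwise.} \end{cases}$

The moments are $\mu_k = \frac{1}{k+1}$.

The frequency integral is

$\Phi(x) = \begin{cases} 0 & \text{in } -\infty < x \le 0 \\ x & \text{in } 0 \le x \le 1 \\ 1 & \text{in } 1 \le x < +\infty \end{cases}$

The inverse is

$f(\xi) = \xi$    $(0 < \xi < 1)$.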


II. The Normal Law of Error.

$\varphi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$

with the moments

$\mu_k = \begin{cases} \frac{(2n)!}{2^n\, n!} & \text{for } k = 2n \\ 0 & \text{for } k = 2n + 1 \end{cases}$

and the frequency integral

$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\, dt$.

There are a number of tables of the numerical values of this
function. Of course these tables can be used to compute the values
of $f(\xi)$. Considering the fact that, for our purposes, the values of
$f(\xi)$ will often be required for simple rational arguments only, it
seems useful to have tables which are converse to those just quoted,
that is to say, the tabulated entry of which is $x$ and the
argument $\xi = \Phi(x)$. Such tables have been calculated by Kelley
and Wood (Statistical Method, New York 1924; Appendix C).

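III. The Laplace Curve.

$\varphi(x) = \tfrac{1}{2}\, e^{-|x|}$

$\mu_k = \begin{cases} (2n)! & \text{for } k = 2n \\ 0 & \text{for } k = 2n + 1 \end{cases}$

$\Phi(x) = \begin{cases} \tfrac{1}{2}\, e^{x} & \text{in } -\infty < x \le 0 \\ 1 - \tfrac{1}{2}\, e^{-x} & \text{in } 0 \le x < +\infty \end{cases}$

$f(\xi) = \begin{cases} \log 2\xi & \text{in } 0 < \xi \le \tfrac{1}{2} \\ -\log 2(1 - \xi) & \text{in } \tfrac{1}{2} \le \xi < 1 \end{cases}$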

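IV. The "Tine Curve".

$\varphi(x) = \begin{cases} 0 & \text{in } -\infty < x \le -1 \\ 1 + x & \text{in } -1 \le x \le 0 \\ 1 - x & \text{in } 0 \le x \le +1 \\ 0 & \text{in } +1 \le x < +\infty \end{cases}$

$\Phi(x) = \begin{cases} 0 & \text{in } -\infty < x \le -1 \\ \tfrac{1}{2}(1 + x)^2 & \text{in } -1 \le x \le 0 \\ 1 - \tfrac{1}{2}(1 - x)^2 & \text{in } 0 \le x \le +1 \\ 1 & \text{in } +1 \le x < +\infty \end{cases}$

$f(\xi) = \begin{cases} -1 + \sqrt{2\xi} & \text{in } 0 < \xi \le \tfrac{1}{2} \\ 1 - \sqrt{2(1 - \xi)} & \text{in } \tfrac{1}{2} \le \xi < 1 \end{cases}$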

2. Ekke's "Best Values"

A. Ekke, in his Kiel dissertation (to appear), deals with the
following question among others: Suppose a frequency function
$\varphi(x)$ and a natural number $n$ given. Which one among all systems
of $n$ values might be considered the "best"? To give
an answer to this question, Ekke divides the total $x$-axis into $n$
parts $I_1, \dots, I_n$ with the separating points $x_1 < x_2 < \dots < x_{n-1}$ in such
a manner that

$\int_{I_\nu} \varphi(x)\, dx = \frac{1}{n}$    $(\nu = 1, \dots, n)$.

Evidently this is possible in one and only one manner, and
we have

$x_\nu = f\!\left(\frac{\nu}{n}\right)$    $(\nu = 1, \dots, n-1)$.

Each of the parts $I_\nu$ should contain exactly one value of
the system. Furthermore it seems reasonable to fix every point
$\xi_\nu$ within its interval $I_\nu$ by the conditions

$\int_{x_{\nu-1}}^{\xi_\nu} \varphi(x)\, dx = \int_{\xi_\nu}^{x_\nu} \varphi(x)\, dx = \frac{1}{2n}$

(with $x_0 = -\infty$, $x_n = +\infty$).
This also can be done in one and only one way. Let us designate
these "best values" by $\xi_1, \dots, \xi_n$. We have

(1)    $\xi_\nu = f\!\left(\frac{2\nu - 1}{2n}\right)$    $(\nu = 1, \dots, n)$.

Concerning the best values, Ekke proves two theorems which
accentuate the rationality of the definition. If $x_1, \dots, x_n$ are
values arranged according to magnitude, and

$S(x) = \begin{cases} 0 & \text{in } x < x_1 \\ \frac{\nu}{n} & \text{in } x_\nu \le x < x_{\nu+1} \quad (\nu = 1, \dots, n-1) \\ 1 & \text{in } x_n \le x \end{cases}$

the following theorem holds:

"There is one and only one system $x_1, \dots, x_n$ for which

$\int_{-\infty}^{+\infty} \{\Phi(x) - S(x)\}^2\, dx$

assumes a minimum, and this system is $\xi_1, \dots, \xi_n$."

This theorem also holds if the exponent 2 is replaced by an arbi-
trary positive number. Furthermore:

"There is one and only one set $x_1, \dots, x_n$ for which the low-
est upper boundary of

$|\Phi(x) - S(x)|$

assumes a minimum, and this set again is identical with $\xi_1, \dots, \xi_n$."

For normalizing purposes Ekke considers, together with a
given frequency function $\varphi(x)$, the totality of the frequency func-
tions which result by linear transformations of the argument, i.e.
which result by translations and dilatations in the direction of the
$x$-axis (or by choosing new origins and new units of measure-
ment). With an arbitrary $\beta$, and $\alpha > 0$ we have to form

$\psi(x) = \alpha\, \varphi(\alpha(x - \beta))$,

the first factor $\alpha$ being required in order to comply with con-
dition 3. The frequency integral corresponding to $\psi(x)$ is

$\Psi(x) = \Phi(\alpha(x - \beta))$

and the inverse

$g(\xi) = \frac{1}{\alpha} f(\xi) + \beta$.

Due to this simple relation between $g(\xi)$ and $f(\xi)$, we have
evidently, if $\eta_1, \dots, \eta_n$ designate the best values of $\psi(x)$,

$\eta_\nu = \frac{1}{\alpha}\, \xi_\nu + \beta$    $(\nu = 1, \dots, n)$.

This fact can be used to pick out from the multitude of functions
$\psi(x)$ a distinct specimen, and then to operate with its best values
only. It is easy to show in a direct manner that there is exactly
one specimen in the multitude which complies with the additional
conditions $\mu_1 = 0$, $\mu_2 = 1$.
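
Best values are easy to evaluate once the inverse of the frequency integral is available. The sketch below is illustrative only and not taken from the paper: the function names are hypothetical, the inverses for the step, Laplace and tine curves are the elementary ones for the densities 1 on (0, 1), e^{-|x|}/2, and the triangle on (-1, 1), and the normal law is assumed to be in standard units, for which the Python standard library supplies the inverse.

```python
import math
from statistics import NormalDist


def f_step(p):
    """Inverse frequency integral of the step curve on (0, 1)."""
    return p


def f_laplace(p):
    """Inverse frequency integral of the Laplace curve exp(-|x|) / 2."""
    return math.log(2 * p) if p <= 0.5 else -math.log(2 * (1 - p))


def f_tine(p):
    """Inverse frequency integral of the triangular 'tine' curve on (-1, 1)."""
    return -1 + math.sqrt(2 * p) if p <= 0.5 else 1 - math.sqrt(2 * (1 - p))


def f_normal(p):
    """Inverse frequency integral of the normal law (standard units assumed)."""
    return NormalDist().inv_cdf(p)


def best_values(inverse, n):
    """The n best values: the inverse taken at (2 nu - 1) / (2 n), nu = 1..n."""
    return [inverse((2 * nu - 1) / (2 * n)) for nu in range(1, n + 1)]


if __name__ == "__main__":
    for name, inv in [("step", f_step), ("Laplace", f_laplace),
                      ("tine", f_tine), ("normal", f_normal)]:
        print(name, [round(v, 4) for v in best_values(inv, 5)])
```

For the three symmetric curves the best values come out symmetric about zero, and for odd n the middle one is the median.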

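3. The Starting Point, General Scope

But the proof of the fact just mentioned can be given indirect-
ly too by considering the inverses $g(\xi)$, and it is this way which
gives the starting point of our further developments. Indeed, if
we introduce (for simplicity) Stieltjes integrals, the conditions
$\mu_1 = 0$, $\mu_2 = 1$ mean

$\int x\, d\Psi(x) = 0$,    $\int x^2\, d\Psi(x) = 1$,

and by the substitution $x = g(\xi)$,

$\int_0^1 g(\xi)\, d\xi = 0$,    $\int_0^1 g(\xi)^2\, d\xi = 1$.

Let us put

$f_0(\xi) = 1$,    $f_1(\xi) = f(\xi)$.

Then our conditions are equivalent to the following demand: Find
coefficients $\alpha_{10}$, $\alpha_{11} > 0$ in such a manner that

$\int_0^1 (\alpha_{10} f_0 + \alpha_{11} f_1)\, d\xi = 0$,    $\int_0^1 (\alpha_{10} f_0 + \alpha_{11} f_1)^2\, d\xi = 1$.

We add: The functions $f_0(\xi)$ and $f_1(\xi)$ are linearly indepen-
dent, i.e. $c_0 f_0(\xi) + c_1 f_1(\xi) = 0$ cannot hold except for $c_0 = c_1 = 0$.

Now it is obvious that our demand represents a special case of
the general problem as follows: Given a set of linearly independent
continuous functions $f_0(\xi), f_1(\xi), \dots, f_k(\xi)$. The scheme
of coefficients

$\beta_{00} \quad 0 \quad 0 \quad \dots \quad 0$
$\beta_{10} \quad \beta_{11} \quad 0 \quad \dots \quad 0$
$\dots$
$\beta_{k0} \quad \beta_{k1} \quad \beta_{k2} \quad \dots \quad \beta_{kk}$

satisfying the additional conditions $\beta_{00} > 0$, $\beta_{11} > 0$, ..., $\beta_{kk} > 0$
shall be chosen so that the functions

$h_0(\xi) = \beta_{00} f_0(\xi)$
$h_1(\xi) = \beta_{10} f_0(\xi) + \beta_{11} f_1(\xi)$
$\dots$
$h_k(\xi) = \beta_{k0} f_0(\xi) + \beta_{k1} f_1(\xi) + \dots + \beta_{kk} f_k(\xi)$

form a normalised orthogonal system, i.e.

$\int_0^1 h_\kappa(\xi)\, h_\lambda(\xi)\, d\xi = \begin{cases} 1 & \text{for } \kappa = \lambda \\ 0 & \text{for } \kappa \ne \lambda \end{cases}$

It is well known that there is one and only one suitable scheme,
and it is furnished by the so-called process of orthogonalization.
Furthermore it is well known that the process of orthogonaliza-
tion is intimately connected with another problem: An arbitrary
continuous function $F(\xi)$ given, to determine the coefficients
$c_0, \dots, c_k$ so that

$\int_0^1 \left[ F(\xi) - \sum_{\kappa=0}^{k} c_\kappa h_\kappa(\xi) \right]^2 d\xi$

assumes a minimum.

Concerning frequency functions, we are led (by pursuing
this line) to a general theory of curve types; an account of the
results to be obtained will be given in a future article.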

Concerning our analysis of statistical data, we do not intend
to use from a given frequency function more than its best values.
More precisely: we intend to replace the frequency function by its
best values. Our modus procedendi now results by analogy: we
have to deal with systems of values (vectors)

$(s_{01}, \dots, s_{0n})$,  $(s_{11}, \dots, s_{1n})$,  ...,  $(s_{k1}, \dots, s_{kn})$

which are linearly independent (see No. 5). We have to employ
the process of orthogonalization, which gives a normalized orthog-
onal system (see No. 6)

$(b_{01}, \dots, b_{0n})$,  $(b_{11}, \dots, b_{1n})$,  ...,  $(b_{k1}, \dots, b_{kn})$

and we have to direct our attention to the sums of the form

$a_1 b_1 + \dots + a_n b_n$,    or better    $\frac{1}{n}(a_1 b_1 + \dots + a_n b_n)$.

Finally we have to introduce the special set of vectors

$(1, \dots, 1)$,  $(\xi_1, \dots, \xi_n)$,  $(\xi_1^2, \dots, \xi_n^2)$,  ...

where $\xi_1, \dots, \xi_n$ designate the best values of a frequency function.

We are now in the position to characterize the direction of
our research in general words: A statistical analysis of distribu-
tions as an application of the theory of orthogonal systems, based
upon the best values of a given frequency function.

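4. Vectors

For our purpose it is convenient to make use of the notations
and simplest operations of vector analysis. If $a_1, \dots, a_n$ are a set
of numbers, we take the symbol $(a_1, \dots, a_n)$ as an individual, call
it a vector and designate it by a gothic letter:

$\mathfrak{a} = (a_1, \dots, a_n)$.

Equality of two vectors $\mathfrak{a} = (a_1, \dots, a_n)$ and $\mathfrak{b} = (b_1, \dots, b_n)$
is defined by

$a_\nu = b_\nu$    $(\nu = 1, \dots, n)$

and is written $\mathfrak{a} = \mathfrak{b}$. The products of a number $c$ with a vector
$\mathfrak{a}$ are defined by

$c\,\mathfrak{a} = \mathfrak{a}\,c = (c\,a_1, \dots, c\,a_n)$,

the sum of two vectors $\mathfrak{a}$ and $\mathfrak{b}$ by

$\mathfrak{a} + \mathfrak{b} = (a_1 + b_1, \dots, a_n + b_n)$.

Evidently we have

$\mathfrak{a} + \mathfrak{b} = \mathfrak{b} + \mathfrak{a}$

and

$(\mathfrak{a} + \mathfrak{b}) + \mathfrak{c} = \mathfrak{a} + (\mathfrak{b} + \mathfrak{c})$.

Hence we may omit the brackets, and the sum of three or more
vectors has a definite sense. More generally, the meaning of the
expression

$c_1 \mathfrak{a}_1 + c_2 \mathfrak{a}_2 + \dots + c_k \mathfrak{a}_k$

is clear. The product of two vectors is (somewhat differently from
the customary way) defined as a number, viz.

$\mathfrak{a}\mathfrak{b} = \frac{1}{n}(a_1 b_1 + \dots + a_n b_n)$

and we have

$\mathfrak{a}\mathfrak{b} = \mathfrak{b}\mathfrak{a}$,    $(\mathfrak{a} + \mathfrak{b})\,\mathfrak{c} = \mathfrak{a}\mathfrak{c} + \mathfrak{b}\mathfrak{c}$.

But in general the vectors $(\mathfrak{a}\mathfrak{b})\,\mathfrak{c}$ and $\mathfrak{a}\,(\mathfrak{b}\mathfrak{c})$ are entirely
different.

Let us put $\mathfrak{o} = (0, \dots, 0)$.
Every vector satisfies $\mathfrak{a}^2 \ge 0$, and $\mathfrak{a} = \mathfrak{o}$ is the only
vector for which $\mathfrak{a}^2 = 0$ holds. Whenever the square root
of the square of a vector, $\sqrt{\mathfrak{a}^2}$,
is met with, we always mean the positive value.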

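5. Linear Independence

A set of vectors $\mathfrak{a}_1, \dots, \mathfrak{a}_k$ is said to be linearly independent
if the equation

$c_1 \mathfrak{a}_1 + c_2 \mathfrak{a}_2 + \dots + c_k \mathfrak{a}_k = \mathfrak{o}$

does not hold except for $c_1 = c_2 = \dots = c_k = 0$. Otherwise the
vectors are said to be linearly dependent. If the vectors
$\mathfrak{a}_1, \dots, \mathfrak{a}_k$
are linearly independent, all the more the same is true for every
partial system. Especially:

Theorem 1. "Let $\mathfrak{a}_1, \dots, \mathfrak{a}_k$ be linearly independent; form
the vectors

(2)    $\mathfrak{b}_\kappa = \beta_{\kappa 1} \mathfrak{a}_1 + \dots + \beta_{\kappa\kappa} \mathfrak{a}_\kappa$    $(\kappa = 1, \dots, k)$

and suppose

$\beta_{\kappa\kappa} \ne 0$    $(\kappa = 1, \dots, k)$.

Then the vectors $\mathfrak{b}_1, \dots, \mathfrak{b}_k$ are also linearly independent."

In fact, if there were a relation of the form

$c_1 \mathfrak{b}_1 + \dots + c_k \mathfrak{b}_k = \mathfrak{o}$

and the factors $c_1, \dots, c_k$ were not all equal to zero, then there
would be a last factor differing from zero, say $c_m$, and we should
have

$c_1 \mathfrak{b}_1 + \dots + c_m \mathfrak{b}_m = \mathfrak{o}$;

if we were to replace $\mathfrak{b}_1, \dots, \mathfrak{b}_m$ by the expressions (2), we
should get a relation of the form

$d_1 \mathfrak{a}_1 + \dots + d_{m-1} \mathfrak{a}_{m-1} + c_m \beta_{mm}\, \mathfrak{a}_m = \mathfrak{o}$,

which is impossible on account of $c_m \beta_{mm} \ne 0$ and the presup-
posed linear independence of $\mathfrak{a}_1, \dots, \mathfrak{a}_k$.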

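In order to prove some further theorems it is convenient
(but not necessary) to make use of the following fundamental
theorem concerning systems of homogeneous linear equations:

"A necessary and sufficient condition that the system of equa-
tions

$a_{11} x_1 + \dots + a_{1n} x_n = 0$
$\dots$
$a_{n1} x_1 + \dots + a_{nn} x_n = 0$

should have no other solution than $x_1 = x_2 = \dots = x_n = 0$ is

$\begin{vmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nn} \end{vmatrix} \ne 0$."

From this statement at once follows:

Theorem 2. "A necessary and sufficient condition that the
vectors

$\mathfrak{a}_1 = (a_{11}, a_{12}, \dots, a_{1n})$
$\dots$
$\mathfrak{a}_n = (a_{n1}, a_{n2}, \dots, a_{nn})$

should be linearly independent is

$\begin{vmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nn} \end{vmatrix} \ne 0$."

In fact, linear dependence of $\mathfrak{a}_1, \dots, \mathfrak{a}_n$ is equivalent to the
existence of values $\lambda_1, \dots, \lambda_n$, not all equal to zero, satisfying

$\lambda_1 a_{11} + \lambda_2 a_{21} + \dots + \lambda_n a_{n1} = 0$
$\dots$
$\lambda_1 a_{1n} + \lambda_2 a_{2n} + \dots + \lambda_n a_{nn} = 0$

and the determinant of these equations is equal to the determinant
above.

Theorem 3. "If $\mathfrak{a}_1, \dots, \mathfrak{a}_k$ are linearly independent, the
number $k$ of the vectors cannot exceed the number $n$ of the com-
ponents: $k \le n$."

We prove this theorem by showing:

"If $n + 1$ vectors

$\mathfrak{a}_\kappa = (a_{\kappa 1}, \dots, a_{\kappa n})$    $(\kappa = 1, \dots, n+1)$

are given, they are linearly dependent."

For obviously, the determinant of the equations

$\lambda_1 a_{11} + \dots + \lambda_{n+1} a_{n+1,1} = 0$
$\dots$
$\lambda_1 a_{1n} + \dots + \lambda_{n+1} a_{n+1,n} = 0$
$\lambda_1 \cdot 0 + \dots + \lambda_{n+1} \cdot 0 = 0$

vanishes, hence the system possesses a solution $\lambda_1, \dots, \lambda_{n+1}$ dif-
ferent from $0, \dots, 0$, and with such values $\lambda_1, \dots, \lambda_{n+1}$ the
first $n$ equations mean

$\lambda_1 \mathfrak{a}_1 + \dots + \lambda_{n+1} \mathfrak{a}_{n+1} = \mathfrak{o}$.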

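6. Normalized Orthogonal Systems of Vectors

If $\mathfrak{a}^2 = 1$, the vector $\mathfrak{a}$ is said to be normalised. Every
vector $\mathfrak{a} \ne \mathfrak{o}$ can be normalized by multiplying it by $\frac{1}{\sqrt{\mathfrak{a}^2}}$.

If $\mathfrak{a}\mathfrak{b} = 0$, the pair of vectors $\mathfrak{a}$ and $\mathfrak{b}$ is said to be
orthogonal. The vectors $\mathfrak{b}_1, \dots, \mathfrak{b}_k$ are said to form an or-
thogonal system, if every pair of them is orthogonal.

Finally the vectors $\mathfrak{b}_1, \dots, \mathfrak{b}_k$ are said to form a normal-
ised orthogonal system, if they form an orthogonal system and each
of them is normalized. Accordingly a normalized orthogonal sys-
tem is characterized by the conditions

(3)    $\mathfrak{b}_\kappa \mathfrak{b}_\lambda = \begin{cases} 1 & \text{for } \kappa = \lambda \\ 0 & \text{for } \kappa \ne \lambda \end{cases}$

Vectors forming a normalized orthogonal system necessarily
are linearly independent. For, from

$c_1 \mathfrak{b}_1 + \dots + c_k \mathfrak{b}_k = \mathfrak{o}$

follows

$(c_1 \mathfrak{b}_1 + \dots + c_k \mathfrak{b}_k)\, \mathfrak{b}_\kappa = 0$

or

$c_1 \mathfrak{b}_1 \mathfrak{b}_\kappa + \dots + c_k \mathfrak{b}_k \mathfrak{b}_\kappa = 0$,

and from (3):

$c_\kappa = 0$    $(\kappa = 1, \dots, k)$.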

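7. The Process of Orthogonalization

Theorem 4. "If the vectors $\mathfrak{a}_1, \dots, \mathfrak{a}_k$ are linearly inde-
pendent, there is one and only one scheme of values

$\beta_{11}$    $(\beta_{11} > 0)$
$\beta_{21} \quad \beta_{22}$    $(\beta_{22} > 0)$
$\dots$
$\beta_{k1} \quad \beta_{k2} \quad \dots \quad \beta_{kk}$    $(\beta_{kk} > 0)$

so that the vectors

(4)    $\mathfrak{b}_1 = \beta_{11} \mathfrak{a}_1$
       $\mathfrak{b}_2 = \beta_{21} \mathfrak{a}_1 + \beta_{22} \mathfrak{a}_2$
       $\dots$
       $\mathfrak{b}_k = \beta_{k1} \mathfrak{a}_1 + \beta_{k2} \mathfrak{a}_2 + \dots + \beta_{kk} \mathfrak{a}_k$

form a normalised orthogonal system."

To prove this theorem, fundamental for our analysis, let us
consider

(5)    $\mathfrak{c}_1 = \mathfrak{a}_1$
       $\mathfrak{c}_2 = \gamma_{21} \mathfrak{a}_1 + \mathfrak{a}_2$
       $\dots$
       $\mathfrak{c}_k = \gamma_{k1} \mathfrak{a}_1 + \gamma_{k2} \mathfrak{a}_2 + \dots + \gamma_{k,k-1} \mathfrak{a}_{k-1} + \mathfrak{a}_k$

From theorem 1 it follows that the vectors $\mathfrak{c}_1, \dots, \mathfrak{c}_k$ are
linearly independent. Let us assume we have already proved that
there is one and only one system of values $\gamma$ so that (5) is an
orthogonal system. Then it follows firstly that the coefficients $\beta$
in (4) can be chosen in at least one suitable manner. For

$\beta_{\kappa\kappa} = \frac{1}{\sqrt{\mathfrak{c}_\kappa^2}}$,    $\beta_{\kappa\lambda} = \frac{\gamma_{\kappa\lambda}}{\sqrt{\mathfrak{c}_\kappa^2}}$    $(\lambda < \kappa)$

are suitable values. Secondly we can deduce the uniqueness of the
coefficients in (4). For suppose $\beta'$ and $\beta''$ to be two suitable
systems of coefficients; this suggests that we form

$\gamma'_{\kappa\lambda} = \frac{\beta'_{\kappa\lambda}}{\beta'_{\kappa\kappa}}$    associated with    $\mathfrak{c}'_\kappa = \frac{\mathfrak{b}'_\kappa}{\beta'_{\kappa\kappa}}$

and

$\gamma''_{\kappa\lambda} = \frac{\beta''_{\kappa\lambda}}{\beta''_{\kappa\kappa}}$    associated with    $\mathfrak{c}''_\kappa = \frac{\mathfrak{b}''_\kappa}{\beta''_{\kappa\kappa}}$.

The vectors $\mathfrak{c}'_1, \dots, \mathfrak{c}'_k$, as well as $\mathfrak{c}''_1, \dots, \mathfrak{c}''_k$, form orthog-
onal systems of the type (5), hence

$\gamma'_{\kappa\lambda} = \gamma''_{\kappa\lambda}$,    $\mathfrak{c}'_\kappa = \mathfrak{c}''_\kappa$,

and furthermore $\beta'_{\kappa\kappa} = \beta''_{\kappa\kappa}$.
Finally, because of the linear independence of $\mathfrak{a}_1, \dots, \mathfrak{a}_k$, $\beta'_{\kappa\lambda} = \beta''_{\kappa\lambda}$.

Accordingly we may confine ourselves to proving the existence
and uniqueness of suitable coefficients $\gamma$ in (5).

This proposition is true for $k = 1$. Let $k \ge 2$ arbitrarily, and
assume the proposition to be proved up to $k - 1$. The vectors
$\mathfrak{c}_1, \dots, \mathfrak{c}_{k-1}$ therefore are orthogonal, and we have to show only:
There is one and only one set of values $\gamma_{k1}, \dots, \gamma_{k,k-1}$ so that
the conditions

(6)    $\mathfrak{c}_k \mathfrak{c}_\kappa = 0$    $(\kappa = 1, \dots, k-1)$

are satisfied.

The vectors $\mathfrak{a}_1, \dots, \mathfrak{a}_{k-1}$ can be represented as linear com-
binations of $\mathfrak{c}_1, \dots, \mathfrak{c}_{k-1}$:

$\mathfrak{a}_\kappa = d_{\kappa 1} \mathfrak{c}_1 + \dots + d_{\kappa,\kappa-1} \mathfrak{c}_{\kappa-1} + \mathfrak{c}_\kappa$    $(\kappa = 1, \dots, k-1)$.

We introduce this into $\mathfrak{c}_k$, and get

(7)    $\mathfrak{c}_k = C_{k1} \mathfrak{c}_1 + \dots + C_{k,k-1} \mathfrak{c}_{k-1} + \mathfrak{a}_k$

with

(8)    $C_{k\lambda} = \gamma_{k\lambda} + \sum_{\kappa = \lambda+1}^{k-1} \gamma_{k\kappa}\, d_{\kappa\lambda}$    $(\lambda = 1, \dots, k-2)$,    $C_{k,k-1} = \gamma_{k,k-1}$.

From the linear independence of $\mathfrak{c}_1, \dots, \mathfrak{c}_{k-1}$ we have
$\mathfrak{c}_\kappa^2 > 0$ and therefore we can deduce from (7):

(9)    $C_{k\kappa} = -\frac{\mathfrak{a}_k \mathfrak{c}_\kappa}{\mathfrak{c}_\kappa^2}$    $(\kappa = 1, \dots, k-1)$.

The coefficients $\gamma_{k1}, \dots, \gamma_{k,k-1}$ having to satisfy the equations (8)
with the values (9) of $C_{k1}, \dots, C_{k,k-1}$, there can exist only
one suitable system $\gamma_{k1}, \dots, \gamma_{k,k-1}$.

Conversely: if $C_{k1}, \dots, C_{k,k-1}$ are chosen according to (9),
and then $\gamma_{k1}, \dots, \gamma_{k,k-1}$ calculated from (8), evidently there re-
sults a vector $\mathfrak{c}_k$ satisfying (6).

We add:

Theorem 5. "If $\mathfrak{a}_1, \dots, \mathfrak{a}_k$ are linearly independent, and
$\mathfrak{b}_1, \dots, \mathfrak{b}_k$ is the corresponding normalised orthogonal system,
then the normalised orthogonal system correspond-
ing to

$\mathfrak{a}'_\kappa = \alpha_{\kappa 1} \mathfrak{a}_1 + \dots + \alpha_{\kappa\kappa} \mathfrak{a}_\kappa$    $(\alpha_{\kappa\kappa} > 0;\ \kappa = 1, \dots, k)$

is identical with $\mathfrak{b}_1, \dots, \mathfrak{b}_k$."

Obviously the vectors $\mathfrak{b}_\kappa$ are of the form

$\mathfrak{b}_\kappa = B_{\kappa 1} \mathfrak{a}'_1 + \dots + B_{\kappa\kappa} \mathfrak{a}'_\kappa$    $(B_{\kappa\kappa} > 0)$.

The proof of theorem 5 now follows as an immediate application
of theorem 4.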
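
The process of orthogonalization is short enough to state as code. The following is a minimal sketch in Python, assuming the vector product of No. 4 (the ordinary scalar product divided by the number of components); the names `dot` and `orthonormalize` are illustrative only and do not occur in the paper.

```python
import math


def dot(u, v):
    """Vector product of No. 4: (u1 v1 + ... + un vn) / n."""
    return sum(p * q for p, q in zip(u, v)) / len(u)


def orthonormalize(vectors):
    """Normalised orthogonal system belonging to linearly independent vectors.

    Each new vector has its components along the earlier basis vectors
    removed and is then scaled to unit square, so its coefficient on the
    newest input vector stays positive, as theorem 4 requires.
    """
    basis = []
    for a in vectors:
        c = list(a)
        for b in basis:
            coeff = dot(a, b)
            c = [ci - coeff * bi for ci, bi in zip(c, b)]
        norm = math.sqrt(dot(c, c))
        if norm == 0.0:
            raise ValueError("vectors are linearly dependent")
        basis.append([ci / norm for ci in c])
    return basis


if __name__ == "__main__":
    for row in orthonormalize([[1.0, 1.0, 1.0, 1.0], [1.0, 2.0, 3.0, 4.0]]):
        print([round(x, 5) for x in row])
```

Theorem 5 can be observed numerically: replacing each input vector by a combination of itself (with positive coefficient) and its predecessors leaves the resulting system unchanged.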

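8. Complete Systems of Normalized
Orthogonal Vectors

A system of normalized orthogonal vectors $\mathfrak{b}_1, \dots, \mathfrak{b}_k$ is
said to be complete if, corresponding to every arbitrary vector $\mathfrak{a}$,
there exist coefficients $c_1, \dots, c_k$ so that

(10)    $\left( \mathfrak{a} - \sum_{\kappa=1}^{k} c_\kappa \mathfrak{b}_\kappa \right)^2 = 0$

holds. Evidently, (10) is equivalent to

$\mathfrak{a} = \sum_{\kappa=1}^{k} c_\kappa \mathfrak{b}_\kappa$.

Theorem 6. "If the $n$ vectors $\mathfrak{b}_1, \dots, \mathfrak{b}_n$ form a nor-
malised orthogonal system, then this system is complete."

Proof. According to theorem 3, the vectors $\mathfrak{a}, \mathfrak{b}_1, \dots, \mathfrak{b}_n$
are linearly dependent, i.e. there is an equality

$\lambda\, \mathfrak{a} + \lambda_1 \mathfrak{b}_1 + \dots + \lambda_n \mathfrak{b}_n = \mathfrak{o}$

and $\lambda, \lambda_1, \dots, \lambda_n$ are not all equal to zero. The vectors $\mathfrak{b}_1, \dots, \mathfrak{b}_n$
being linearly independent, we have necessarily $\lambda \ne 0$. Hence

$\mathfrak{a} = -\frac{\lambda_1}{\lambda} \mathfrak{b}_1 - \dots - \frac{\lambda_n}{\lambda} \mathfrak{b}_n$.

The condition $k = n$ is also necessary for completeness, but
we shall not have to make use of it.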

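9. Approximation in the Mean

Let us consider a normalized orthogonal system $\mathfrak{b}_1, \dots, \mathfrak{b}_k$,
and an arbitrary vector $\mathfrak{a}$. We wish to determine the coefficients
$c_1, \dots, c_k$ in such a way that

$\left( \mathfrak{a} - \sum_{\kappa=1}^{k} c_\kappa \mathfrak{b}_\kappa \right)^2$

assumes a minimum. If there exists a suitable set of coefficients,
we say that the corresponding linear combination $\sum c_\kappa \mathfrak{b}_\kappa$
gives a "best approximation in the mean" to the vector $\mathfrak{a}$.

The following transformations will at once clear up the sit-
uation:

$\left( \mathfrak{a} - \sum c_\kappa \mathfrak{b}_\kappa \right)^2 = \mathfrak{a}^2 - 2 \sum c_\kappa\, \mathfrak{a}\mathfrak{b}_\kappa + \sum c_\kappa^2$
$= \mathfrak{a}^2 - \sum (\mathfrak{a}\mathfrak{b}_\kappa)^2 + \sum (c_\kappa - \mathfrak{a}\mathfrak{b}_\kappa)^2$,

and if we designate

$a_\kappa = \mathfrak{a}\mathfrak{b}_\kappa$    $(\kappa = 1, \dots, k)$

we have the fundamental equation

(11)    $\left( \mathfrak{a} - \sum_{\kappa=1}^{k} c_\kappa \mathfrak{b}_\kappa \right)^2 = \mathfrak{a}^2 - \sum_{\kappa=1}^{k} a_\kappa^2 + \sum_{\kappa=1}^{k} (c_\kappa - a_\kappa)^2$.

On the right hand, the coefficients $c_\kappa$ are not met with
but in the last sum, and this sum assumes a minimum for $c_\kappa = a_\kappa$
only. By that, we have:

Theorem 7. "Among all linear combinations of the normal-
ised orthogonal vectors $\mathfrak{b}_1, \dots, \mathfrak{b}_k$ there is one and only one which
gives a best approximation in the mean to the vector $\mathfrak{a}$, and the
"best coefficients" are $a_\kappa = \mathfrak{a}\mathfrak{b}_\kappa$."

The equation (11) admits some important conclusions con-
cerning the coefficients $a_\kappa$. Putting

$c_\kappa = a_\kappa$    $(\kappa = 1, \dots, k)$

we derive

(12)    $\left( \mathfrak{a} - \sum_{\kappa=1}^{k} a_\kappa \mathfrak{b}_\kappa \right)^2 = \mathfrak{a}^2 - \sum_{\kappa=1}^{k} a_\kappa^2$.

The left side herein evidently is not negative, hence

(13)    $a_1^2 + \dots + a_k^2 \le \mathfrak{a}^2$.

Finally, if $\mathfrak{b}_1, \dots, \mathfrak{b}_n$ is a complete system of normalized
orthogonal vectors, the preceding reasonings of course hold for
every $k = 1, 2, \dots, n$. But we can show more than (13), viz.

(14)    $a_1^2 + \dots + a_n^2 = \mathfrak{a}^2$.

Indeed, according to the definition of completeness we have
with suitable $c_1, \dots, c_n$:

$\left( \mathfrak{a} - \sum_{\kappa=1}^{n} c_\kappa \mathfrak{b}_\kappa \right)^2 = 0$

and a fortiori, by theorem 7,

$\left( \mathfrak{a} - \sum_{\kappa=1}^{n} a_\kappa \mathfrak{b}_\kappa \right)^2 = 0$,

which is, regarding (12), equivalent to (14).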
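
Equations (11) to (13) can be checked on a small numerical case. The sketch below is illustrative Python, not part of the paper: the two basis vectors are a hand-picked normalised orthogonal pair with three components (under the product of No. 4, which divides by n), and the helper names are hypothetical.

```python
import math


def dot(u, v):
    """Vector product of No. 4: (u1 v1 + ... + un vn) / n."""
    return sum(p * q for p, q in zip(u, v)) / len(u)


def best_coefficients(a, basis):
    """The 'best coefficients' of theorem 7: products of a with each basis vector."""
    return [dot(a, b) for b in basis]


def residual(a, basis, coeffs):
    """Square of (a - sum of coeffs[k] * basis[k])."""
    r = list(a)
    for c, b in zip(coeffs, basis):
        r = [ri - c * bi for ri, bi in zip(r, b)]
    return dot(r, r)


s = math.sqrt(1.5)
basis = [[1.0, 1.0, 1.0], [-s, 0.0, s]]   # an incomplete system: k = 2 < n = 3
a = [2.0, -1.0, 4.0]
best = best_coefficients(a, basis)

if __name__ == "__main__":
    print(best, residual(a, basis, best), dot(a, a) - sum(c * c for c in best))
```

Moving any coefficient away from its best value by some amount raises the residual by the square of that amount, which is exactly the last sum in (11).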

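10. The Tchebychef Coefficients

Let $\xi_1, \dots, \xi_n$
be a set of best values corresponding to a given frequency function
$\varphi(x)$ (see No. 2). We form

(15)    $\mathfrak{s}_0 = (1, \dots, 1)$
        $\mathfrak{s}_1 = (\xi_1, \dots, \xi_n)$
        $\dots$
        $\mathfrak{s}_{n-1} = (\xi_1^{n-1}, \dots, \xi_n^{n-1})$

The vectors $\mathfrak{s}_0, \dots, \mathfrak{s}_{n-1}$ are linearly independent. For

$c_0 \mathfrak{s}_0 + c_1 \mathfrak{s}_1 + \dots + c_{n-1} \mathfrak{s}_{n-1} = \mathfrak{o}$

means

$c_0 + c_1 \xi_\nu + \dots + c_{n-1} \xi_\nu^{n-1} = 0$    $(\nu = 1, \dots, n)$,

that is, the polynomial

$F(\xi) = c_0 + c_1 \xi + \dots + c_{n-1} \xi^{n-1}$

of degree $\le n - 1$ possesses $n$ different zeros $\xi_1, \dots, \xi_n$.
But the number of zeros of a polynomial cannot exceed its degree
unless all coefficients vanish. Hence $c_0 = c_1 = \dots = c_{n-1} = 0$.

Let us designate the (complete) set of normalized orthogonal
vectors corresponding to $\mathfrak{s}_0, \dots, \mathfrak{s}_{n-1}$ by $\mathfrak{b}_0, \dots, \mathfrak{b}_{n-1}$.

When we have to deal with a set of observations $x_1, \dots, x_n$,
there will not be any practical loss of generality if we assume these
values arranged according to magnitude,

$x_1 \le x_2 \le \dots \le x_n$,

and to be not all equal. Then we define the vector $\mathfrak{x}$ by

$\mathfrak{x} = (x_1, x_2, \dots, x_n)$

and we propose to call the coefficients

$a_\kappa = \mathfrak{x}\, \mathfrak{b}_\kappa$    $(\kappa = 0, 1, \dots, n-1)$

"Tchebychef Coefficients" of $x_1, \dots, x_n$.

The central position of the Tchebychef coefficients for analyz-
ing purposes is pointed out by the following theorems 8 and 9.

Theorem 8. "The set $\mathfrak{b}_0, \dots, \mathfrak{b}_{n-1}$, and a fortiori the
Tchebychef coefficients $a_0, \dots, a_{n-1}$ of the observations
$x_1, \dots, x_n$, do not depend on the special frequency function $\varphi(x)$,
but on the type only to which $\varphi(x)$ belongs."

To prove this theorem, let us consider, besides $\varphi(x)$, an arbi-
trary individual of its type,

$\psi(x) = \alpha\, \varphi(\alpha(x - \beta))$    $(\alpha > 0)$.

The best values corresponding to $\psi(x)$ are (see No. 2)

$\eta_\nu = \frac{1}{\alpha}\, \xi_\nu + \beta$,

and we deduce, if $\mathfrak{s}'_0, \dots, \mathfrak{s}'_{n-1}$ designate the vectors (15) ob-
tained from $\eta_1, \dots, \eta_n$ instead of $\xi_1, \dots, \xi_n$,

$\mathfrak{s}'_0 = \mathfrak{s}_0$
$\mathfrak{s}'_1 = \beta\, \mathfrak{s}_0 + \frac{1}{\alpha}\, \mathfrak{s}_1$
$\dots$

Hence, by an application of theorem 5, the normalized orthogonal
vectors $\mathfrak{b}'_0, \dots, \mathfrak{b}'_{n-1}$ are identical with $\mathfrak{b}_0, \dots, \mathfrak{b}_{n-1}$.

If we choose a new unit of measurement and a new origin,
that is to say if we perform a transformation

$x' = \alpha x + \beta$    $(\alpha > 0)$,

the vectors $\mathfrak{b}_0, \dots, \mathfrak{b}_{n-1}$ do not change (by the reasoning just
finished). The vector $\mathfrak{x}$ changes into

$\mathfrak{x}' = \alpha\, \mathfrak{x} + \beta\, \mathfrak{b}_0$

and we have:

Theorem 9. "If a new unit of measurement and a new ori-
gin are introduced, say

$x' = \alpha x + \beta$    $(\alpha > 0)$,

then the Tchebychef coefficients change into

$a'_0 = \alpha a_0 + \beta$,  $a'_1 = \alpha a_1$,  $a'_2 = \alpha a_2$,  ...,  $a'_{n-1} = \alpha a_{n-1}$."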

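11. Mean and Dispersion. Coefficients
of Skewness and Kurtosis

Preparatory to the definition in this chapter, let us consider
$a_0$ and $a_1$ especially. To begin with, we have

$\mathfrak{b}_0 = (1, 1, \dots, 1)$

and therefore

$a_0 = \frac{1}{n}(x_1 + \dots + x_n)$.

The proof of theorem 4 furnishes a convenient way to com-
pute $\mathfrak{b}_1$. We put

$\mathfrak{c}_1 = \gamma\, \mathfrak{s}_0 + \mathfrak{s}_1$

and determine $\gamma$ so that $\mathfrak{c}_1 \mathfrak{s}_0 = 0$:

$\gamma = -\mathfrak{s}_1 \mathfrak{s}_0$.

With the designations

$m_1 = \frac{1}{n}(\xi_1 + \dots + \xi_n)$,    $m_2 = \frac{1}{n}(\xi_1^2 + \dots + \xi_n^2)$

we obtain

$\mathfrak{c}_1 = \mathfrak{s}_1 - m_1 \mathfrak{s}_0$,    $\mathfrak{c}_1^2 = m_2 - m_1^2$.

Hence

$\mathfrak{b}_1 = \frac{\mathfrak{s}_1 - m_1 \mathfrak{s}_0}{\sqrt{m_2 - m_1^2}}$,    $a_1 = \frac{1}{n} \sum_{\nu=1}^{n} x_\nu\, \frac{\xi_\nu - m_1}{\sqrt{m_2 - m_1^2}}$.

Concerning $a_1$, we have now to deal with a theorem which is
of the greatest importance for our purposes.

Theorem 10. "The Tchebychef coefficient $a_1$ is always pos-
itive:

$a_1 > 0$."

For a proof we can proceed as follows: if we designate the
components of $\mathfrak{b}_1$ by $z_1, \dots, z_n$, we have

$z_1 < z_2 < \dots < z_n$    and    $z_1 + \dots + z_n = 0$.

From this we deduce the existence of a subscript $m$ so that

$z_1 < \dots < z_m \le 0 < z_{m+1} < \dots < z_n$.

Let us put

$Z_\nu = z_1 + \dots + z_\nu$    $(\nu = 1, \dots, n)$.

Then we have

$Z_1 < 0$,    $Z_1 > Z_2 > \dots \ge Z_m$,    $Z_m < Z_{m+1} < \dots < Z_n = 0$,

which gives

$Z_\nu < 0$    $(\nu = 1, \dots, n-1)$.

On the other hand, the identity

$\sum_{\nu=1}^{n} x_\nu z_\nu = -\sum_{\nu=1}^{n-1} Z_\nu\, (x_{\nu+1} - x_\nu)$

holds. The differences $x_{\nu+1} - x_\nu$ are all $\ge 0$, and
$x_1, \dots, x_n$ being subjected to the condition not to be all equal, at
least one difference really is positive. Hence

$\sum_{\nu=1}^{n} x_\nu z_\nu > 0$,    $a_1 = \frac{1}{n} \sum_{\nu=1}^{n} x_\nu z_\nu > 0$.

There are no restrictions for the Tchebychef coefficients dif-
ferent from $a_1$ as far as their signs are concerned.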

The reader, after having verified the truth of the following 
statement, will now be prepared to accept the definition below, 

^ 7 / the mciar ^ is of the form 

. ^ 4 -. 4 } ^ ^ % 7 

the d§n of d^coincMes tmih 'that of ; if it is of the form 

ike sign of€i^ coincides with that of -icmd so ' ’ 

Definition. "A type of frequency function being given, the
Tchebychef coefficients $a_0, a_1, a_2, a_3$ of the observations
$x_1, \dots, x_n$ are called:

$a_0$ = Mean of the Observations
$a_1$ = Dispersion of the Observations
$a_2$ = Tchebychef Coefficient of Skewness of the Ob-
servations
$a_3$ = Tchebychef Coefficient of Kurtosis of the Ob-
servations."

We do not believe the Tchebychef coefficients with a higher
subscript than 3 to be of any practical interest.

12. Measures of Skewness and Kurtosis

No matter how the mean and the dispersion of a set of obser-
vations are defined, the dispersion will always have to depend on
the unit of measurement, and the mean furthermore on the origin.
But the case is a different one concerning the concepts of skewness
and kurtosis. Here it is reasonable to raise the question for meas-
ures in the strict sense. It is obvious that such measures will be
obtained if the set of observations is (by a convenient choice of a
new unit) brought to the dispersion 1; the new Tchebychef coef-
ficients of skewness and kurtosis will be suitable. This leads to the

Definition. "With the designations of the preceding chap-
ter, the ratios $\frac{a_2}{a_1}$ and $\frac{a_3}{a_1}$ shall be called:

$\frac{a_2}{a_1} = S$ = Measure of Skewness of the Observations

$\frac{a_3}{a_1} = K$ = Measure of Kurtosis of the Observations."

There will be no misunderstanding if we use the words
"Skewness" and "Kurtosis" instead of "Measure of Skewness"
and "Measure of Kurtosis". Utilizing theorems 8 and 9, we have
at once:

Theorem 11. "The measures of skewness and kurtosis de-
pend on the type of frequency function and on the observations
only; they are independent of origin and unit of measurement."
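
The chain from best values to the measures S and K can be sketched in Python as follows. This is an illustration under stated assumptions, not the direct computing scheme the paper announces for No. 15: all function names are hypothetical, the vector product is the one of No. 4 (sum of products divided by n), and the sample observations are invented for the check.

```python
import math


def dot(u, v):
    """Vector product of No. 4: (u1 v1 + ... + un vn) / n."""
    return sum(p * q for p, q in zip(u, v)) / len(u)


def orthonormalize(vectors):
    """Normalised orthogonal system with positive leading coefficients."""
    basis = []
    for a in vectors:
        c = list(a)
        for b in basis:
            coeff = dot(a, b)
            c = [ci - coeff * bi for ci, bi in zip(c, b)]
        norm = math.sqrt(dot(c, c))
        basis.append([ci / norm for ci in c])
    return basis


def tchebychef_coefficients(observations, best_values, count=4):
    """a_0 .. a_(count-1) of the sorted observations for the given best values."""
    xs = sorted(observations)
    powers = [[xi ** k for xi in best_values] for k in range(count)]
    return [dot(xs, b) for b in orthonormalize(powers)]


def skewness_kurtosis(observations, best_values):
    """Measures S = a_2 / a_1 and K = a_3 / a_1."""
    a = tchebychef_coefficients(observations, best_values)
    return a[2] / a[1], a[3] / a[1]


if __name__ == "__main__":
    xi = [(2 * nu - 1) / 20 for nu in range(1, 11)]   # step curve, n = 10
    obs = [3, 4, 4, 5, 7, 8, 9, 12, 15, 20]           # invented sample
    print(tchebychef_coefficients(obs, xi), skewness_kurtosis(obs, xi))
```

Changing the unit and origin of the observations, or replacing the best values by those of another curve of the same type, leaves S and K unchanged up to rounding, in line with theorems 8, 9 and 11.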

13. Meaning op Skewness and Kurtosis 
To secure an idea of the mechanism of skewness and kurtosis, 
let us construct some examples which show' these ' phenomena in 



52 


ONE-DIMENSI OMAL BIST RIB UTIONS 


complete purity. We will use the step function, and we intend to 
choose the values • • •> so that they are affected— apart from 
the inevitable dispersion— in the first place with skewness only, in 
the second place with kiirtosis only. 

We take 72= /o , and for the convenience of the reader we 
actually write down the vectors ‘ ‘ 'j, ^3 • We observe how- 

ever that ill practice one will never evaluate these vectors, but 
rather compute the Tcliebychef coefficients in the direct manner 
described in No. 15. 


We obtain

 ν     y_0       y_1         y_2         y_3
 1      1     -1.56670    +1.65145    -1.43388
 2      1     -1.21855    + .55048    + .47796
 3      1     - .87039    - .27524    +1.19490
 4      1     - .52223    - .82572    +1.05834
 5      1     - .17408    -1.10096    + .40968
 6      1     + .17408    -1.10096    - .40968
 7      1     + .52223    - .82572    -1.05834
 8      1     + .87039    - .27524    -1.19490
 9      1     +1.21855    + .55048    - .47796
10      1     +1.56670    +1.65145    +1.43388

We shall have to come back to these vectors in No. 17. For this reason
they have been calculated more accurately than is necessary here.
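As a check on the table, the vectors can be regenerated numerically. The following is only a sketch and not the procedure of No. 15: it assumes that the best values of the step function for n = 10 are equidistant, that the scalar product of two vectors is the mean of the products of their components, and that the vectors arise by orthonormalizing the successive powers of the best values (Gram-Schmidt process). The names `inner` and `orthonormal_vectors` are illustrative.

```python
import math

def inner(u, v):
    # Scalar product assumed here: mean of the component-wise products.
    return sum(a * b for a, b in zip(u, v)) / len(u)

def orthonormal_vectors(best_values, degree):
    """Gram-Schmidt on the power vectors (xi^0, xi^1, ..., xi^degree)."""
    basis = []
    for k in range(degree + 1):
        v = [xi ** k for xi in best_values]
        for y in basis:
            c = inner(v, y)
            v = [a - c * b for a, b in zip(v, y)]
        norm = math.sqrt(inner(v, v))
        basis.append([a / norm for a in v])
    return basis

n = 10
# Equidistant best values, symmetric about zero; the spacing does not
# influence the normalized vectors.
best_values = [2 * nu - (n + 1) for nu in range(1, n + 1)]   # -9, -7, ..., +9
y0, y1, y2, y3 = orthonormal_vectors(best_values, 3)

for nu in range(n):
    print(f"{nu + 1:2d} {y0[nu]:+.5f} {y1[nu]:+.5f} {y2[nu]:+.5f} {y3[nu]:+.5f}")
```

Under these assumptions the printed rows agree with the tabulated values to about four decimal places.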

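1a. Positive skewness.

$$x = y_1 + \tfrac{1}{5}\, y_2 \qquad (a_1 = 1,\ a_2 = \tfrac{1}{5},\ a_\kappa = 0 \text{ otherwise}).$$

1b. Negative skewness.

$$x = y_1 - \tfrac{1}{5}\, y_2 \qquad (a_1 = 1,\ a_2 = -\tfrac{1}{5},\ a_\kappa = 0 \text{ otherwise}).$$

2a. Positive kurtosis.

$$x = y_1 + \tfrac{1}{10}\, y_3 \qquad (a_1 = 1,\ a_3 = \tfrac{1}{10},\ a_\kappa = 0 \text{ otherwise}).$$

2b. Negative kurtosis.

$$x = y_1 - \tfrac{1}{10}\, y_3 \qquad (a_1 = 1,\ a_3 = -\tfrac{1}{10},\ a_\kappa = 0 \text{ otherwise}).$$

The components of the different vectors $x$ are put together in the
table below: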





 ν       1a         1b         2a         2b
 1    -1.236     -1.897     -1.710     -1.423
 2    -1.108     -1.329     -1.171     -1.266
 3    - .925     - .815     - .751     - .990
 4    - .687     - .357     - .416     - .628
 5    - .394     + .046     - .133     - .215
 6    - .046     + .394     + .133     + .215
 7    + .357     + .687     + .416     + .628
 8    + .815     + .925     + .751     + .990
 9    +1.329     +1.108     +1.171     +1.266
10    +1.897     +1.236     +1.710     +1.423


To illustrate the preceding, we compare the vectors $x$ with their
corresponding "best systems of best values", that is to say with the
vectors

$$a_0 y_0 + a_1 y_1,$$

and carry it through with some figures. We place the components of $x$
on a horizontal straight line I, the components of $a_0 y_0 + a_1 y_1$
on a second straight line II below:



[Figures 1a (Positive Skewness), 1b (Negative Skewness), 2a (Positive Kurtosis), 2b (Negative Kurtosis): in each figure the components of the vector of observations are marked on line I and the components of the best system of best values on line II.]


The reader should settle his mind upon the fact that the general
behaviour of observations affected with skewness only or kurtosis only
is always the same, no matter which type of frequency function is
considered. The meaning of skewness and kurtosis can be, generally
speaking, expressed by:

Positive Skewness = Overconcentration to the Left
Negative Skewness = Overconcentration to the Right
Positive Kurtosis = Overconcentration near the Mean
Negative Kurtosis = Underconcentration near the Mean.
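The four examples of this section can be rebuilt and their measures of skewness and kurtosis evaluated directly. The sketch below rests on several assumptions that are not spelled out in this section: equidistant best values for the step type, the mean of products as scalar product, Tchebychef coefficients taken as the scalar products of the observations with the orthonormal vectors, the dispersion equal to the coefficient of subscript 1, and the multipliers 1/5 and 1/10 in the examples (inferred from the tabulated components). All function names are illustrative.

```python
import math

def inner(u, v):
    # Scalar product assumed here: mean of the component-wise products.
    return sum(a * b for a, b in zip(u, v)) / len(u)

def orthonormal_vectors(best_values, degree):
    # Gram-Schmidt on the powers of the best values (illustrative helper).
    basis = []
    for k in range(degree + 1):
        v = [xi ** k for xi in best_values]
        for y in basis:
            c = inner(v, y)
            v = [a - c * b for a, b in zip(v, y)]
        norm = math.sqrt(inner(v, v))
        basis.append([a / norm for a in v])
    return basis

def skewness_kurtosis(x, basis):
    # Coefficients a_k = (x, y_k); S = a_2 / a_1 and K = a_3 / a_1.
    a = [inner(x, y) for y in basis]
    return a[2] / a[1], a[3] / a[1]

n = 10
basis = orthonormal_vectors([2 * nu - (n + 1) for nu in range(1, n + 1)], 3)
y0, y1, y2, y3 = basis

examples = {
    "1a": [p + q / 5 for p, q in zip(y1, y2)],    # positive skewness
    "1b": [p - q / 5 for p, q in zip(y1, y2)],    # negative skewness
    "2a": [p + q / 10 for p, q in zip(y1, y3)],   # positive kurtosis
    "2b": [p - q / 10 for p, q in zip(y1, y3)],   # negative kurtosis
}

for name, x in examples.items():
    S, K = skewness_kurtosis(x, basis)
    # A change of origin and unit, x -> 3 + 2x, should leave S and K unchanged.
    S2, K2 = skewness_kurtosis([3 + 2 * t for t in x], basis)
    print(name, round(S, 6), round(K, 6), round(S2, 6), round(K2, 6))
```

Each example carries one of the two measures only, and the second pair of printed numbers illustrates the independence of origin and unit asserted in theorem 11.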

14. Measures of Approximation
Let $T$ be a type of frequency function, $x = (x_1, \ldots, x_n)$ a set
of observations, and $k \le n - 1$ a "degree of approximation", that is
the subscript in the sum expression

$$a_0 y_0 + a_1 y_1 + \cdots + a_k y_k .$$

The sum of squares

$$a_1^2 + a_2^2 + \cdots + a_k^2$$

will give us a clue to the quality of approximation to the vector $x$
which is obtained on the basis of the type $T$ and the degree of
approximation $k$. But the expression above of course is not yet fit to
be taken as a measure of the quality of approximation. Therefore it
will be necessary in the first instance to modify it so that it will
become not only independent of the origin, but also independent of the
unit of measurement.

Regarding theorem 9, and making reflections customary in situations of
this kind, we are almost compulsorily led to the

Definition. "The ratios

$$M_k = \frac{\sqrt{a_1^2 + a_2^2 + \cdots + a_k^2}}{\sigma_0} \qquad (k = 1, 2, \ldots, n-1),$$

where $\sigma_0$ is the dispersion of the observations in the
traditional sense, shall be called measures of approximation of the
degrees $k$."

Theorem 12. "The measures of approximation depend on the type $T$ and
on the observations only; they are independent of origin and unit of
measurement. Furthermore they satisfy

$$0 \le M_1 \le M_2 \le \cdots \le M_{n-1} = 1 ."$$

All is clear if we write $M_k^2$, utilizing the relation (14) in No. 9,
in the form

$$M_k^2 = \frac{a_1^2 + a_2^2 + \cdots + a_k^2}{a_1^2 + a_2^2 + \cdots + a_{n-1}^2},$$

paying attention to theorems 6, 8 and 9.

If $M_k$ is not much smaller than 1, the approximation of degree $k$
will be estimated to be good. If $T$ and $T'$ are two types of
frequency function, $M_k$ and $M_k'$ the corresponding measures of
approximation, and if $M_k \ge M_k'$, we say: $T$ is, for the degree
$k$, better than $T'$ (equivalence not excluded). If
$M_\kappa' \ge M_\kappa$ for $\kappa = 1, 2, \ldots, k$, we say: $T'$
is, up to the degree $k$, better than $T$.

Clearly we may base upon these concepts a method of curve-fitting. A
full account will be given in a future note.

15. Computation of the Tchebychef Coefficients
If the vectors $y_0, y_1, \ldots, y_k$ are already known, the finding
of $a_0, a_1, \ldots, a_k$ is, according to their definition, very
simple. But the actual calculation of $y_0, y_1, \ldots, y_k$ is
embarrassing, especially if $n$ is large. We already mentioned that
this can be and should be avoided, and we recommend the following
procedure.

We form, just as in the proof of theorem 4, 

= 'to 

and to determine -the coefficients , we demand that the vectors y 
be orthogonal. Let % be an arbitrary subscript among ^ K . 
Then at least it must be true that 


f- 


and a fortiori 




+ < 


x-i 




for arbitrary values ^x-i '• are li: 


linear 





combinations of - 


hence 


( 17 ) 

* For abbreviation let ns designate the moments of the best 
values 4^/*-) 

Obviously we have 




'■R P*-f 

and the equation (17) produces 


Ci%‘i = )' 


' Txo 

1 

* 

^ ^ aCj X-/ 

(18) ) rri -r -h -m Y +■■ 

^ ' /XiJ ^ 1 XI 

• 4- 


7-n. _ Y ■!■ yn. Y ' 

x~l txo X 'X! 


77h >r f ?;i «- 0 

^X-Z Ix,7C-f 


Conversely, from (18) follows (16) Jience the equations (18) 


must have exactly one solution ^ x-/ 


Concerning’ 

: the normalizing factor in 


, we have 



(y:x 


f * • ’ ^ 


4)’- 


= (rn^ 


y- 77^^., 

r 't 

rXjX'i 




4-( w, 


f 77?^ 

X f 

^^kh)Y, 


■f ( 

and from (18) ; 

yy-" 


1 

i 





■ f- 7« , 

-2; 

X-/ ^ 




With the abbreviations 






For the calculation of $a_0, a_1, \ldots, a_k$ we recommend operating
according to the following recipe, in the demonstration of which we
confine ourselves to the most important case $k = 3$. The modus
procedendi for other values of the degree $k$ will be clear.

1 . Compute 


In the interest of, the accuracy of the results it is advisable to 
take care that the equations 



are precisely or approximately satisfied. This will be the case .if 
^ hold; otherwise introduce 

instead of , with convenient constants ^ ^ /S, 

2. Compute 

Again it is useful to take care that the equations 
^ (70^ f-* * * X^) ss o j ~ ^ 

are precisely or approximately satisfied. This will be secured if 
distributed over nearly the same interval as 


Xs C 


n 

4n. 






3. 

Form the scheme 




^<JO 

^0! 

^02 


/'n. ’n 




Clff 


^-3 \ - 

/ 


m4 


oq, 

^3LX 

«-t3 1 

\ W, 


^5 

0.3^ 



0,5/ 

\ rrty 

7775 

W4 


4a. To every element of the second, third and fourth row in 
this scheme add the corresponding element of the first row multi- 
and - respectively, so that there results 


phed by 
u scheme 





4b. To every element of the second and third row in the 
scheme 


< 

^iz 


ai, 





^33 


add the corresponding element of the first nne multiplied by 
— and - ^ 3 / respectively, so that there results a scheme 


af 



o 


CLff 

o 

^32 

(X 

B3 


4c. I'o ei^ery eieme^tf of the second row of the scheme 

I i 

add the corresponding element of the first row multiplied by - 
so that there results a scheme 


5, Extract 

6, Multiply the elements of the first, secotid and third row in 

the scheme ' / ^ rt ^ % 


a; 

a scheme 




^4? 3 

cT 

<z 


o 


a!* 


respectively, so that there results 


and extrart 


_ , ; 7., ' To every ekmenf of the first row in the scheme B add the 
torrespmidmfi element of ^ the second row nmlHplied by -P ^ so 





that there results a scheme 

O 


f 




B - 


/ 

o 


fr. 


fZ 

i 


and extract 


^3 
^3 - 


izo 


I 




8. To every element of the first and second row in the scheme 
^ add the corresponding element of the third row multiplied by 
"“^2 “■ ^2 respectively, so that there results a scheme 


tf 


73 

6 


3 = 

( 

/ 

6 



VO 

C 

/ 

*^3 


and extract 

bo c 
9. Form 


ff 

ej 7 


% 


= -# 


'3 7 


Ts2 


^z3 


% 






•v 

3 


X 

L X 

r, 


x: 

r„ u.X'- X,, 


Al = a 


X ^ 


(f=: a 





Va. 

^3 = 

5 » 


K = 







16. Controls of Computation
It is easy to point out controls for the process of evaluation of
$a_0, a_1, a_2, a_3$ which do not require any considerable extra work,
and yet indicate every occurring miscalculation with almost absolute
safety. Such general controls, of course, can not bear
upon the ascertainment of 

A. Control of 777^ , 777 , , , rrt^ ^ ^ 

-h 3 ( 771 ^ ^ -h ■ 

lb Control of ^51 ’ 

^ 71 . 5 ^ 

m^t3i7T7^trn;)7 777^^ = ^ * 

C. Control Of X , K, )r •, 

X+3(:)C^X;)f x; = -^ J . 

D. Control of • 

E. Control of Y' i f i' . Y' ft * 

The operations indicated under 3-8 in No. 15 are essentially nothing
else than the solution of three systems of linear equations for one,
two and three unknowns respectively, contracted into one uniform
process of reckoning. Hence we can make use of the method of control by
sums. We have to add the sums

$$a_{00} + a_{01} + a_{02} + a_{03}, \quad \ldots, \quad a_{30} + a_{31} + a_{32} + a_{33}$$

as elements of a fifth column, and to transform this expanded scheme in
the way described in No. 15. Then everywhere the sum of the four
elements of each row must equal its fifth element. If this is true for
the scheme B especially, it is practically impossible that
$\gamma_{10}, \gamma_{20}, \ldots, \gamma_{32}$ should have been
wrongly computed.
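The control by sums can be tried out mechanically. The sketch below is an illustration under stated assumptions, not a transcription of the recipe of No. 15: it takes the moments of the best values from the example of No. 19, forms the scheme of four columns whose element in row i and column j is the moment of order i + j, appends the row sums as a fifth column, and carries out ordinary elimination on the expanded rows. The helper name `row_is_consistent` is hypothetical.

```python
# Moments m_0 ... m_6 of the best values from the example of No. 19 (m_0 = 1).
m = [1.0, -0.00113, 0.96397, -0.01283, 2.72531, -0.05851, 12.32651]

# Scheme of four columns, expanded by a fifth column holding the row sums.
scheme = [[m[i + j] for j in range(4)] for i in range(4)]
scheme = [row + [sum(row)] for row in scheme]

def row_is_consistent(row, tol=1e-9):
    # The control: the first four elements must add up to the fifth.
    return abs(sum(row[:4]) - row[4]) < tol

checks = []
for col in range(3):
    for r in range(col + 1, 4):
        factor = -scheme[r][col] / scheme[col][col]
        # The fifth element is transformed exactly like the other four.
        scheme[r] = [a + factor * b for a, b in zip(scheme[r], scheme[col])]
        checks.append(row_is_consistent(scheme[r]))

print(all(checks))

# A slip in a single element is exposed at once by the control.
scheme[3][3] += 0.001
print(row_is_consistent(scheme[3]))
```

Because every row operation is linear, the fifth element stays equal to the sum of the other four as long as no slip occurs, which is what makes the control nearly free of extra work.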

; ' ' F. Control of, 

'The computatidii, of ' should be performed ' by 

starting' from^ 'tlws' scheme 





/ 

T 

Uc 

Xc 

L 

£> 

/ 

%, 

L 

<3 

c 

1 


o 

e 

O 

1 



^ ^ tr 

iXf 

I 


with the meaning ]7o - 5^ 

*n, = 

^ r3i = 

I - s,. 

Multiply the elements of the first, second, third and fourth 

row by )C^ and respectively, and form the sums of 

the elements of each column. The sums of the first four columns 

are X we have the control % ^ 

where K designates the sum of the fifth column 

17. Examples

I. Let the observations ($n = 10$)

$$x = (-1.5,\ -1.0,\ -.7,\ -.5,\ -.3,\ 0,\ +.3,\ +.6,\ +1.0,\ +1.8)$$

be given, and let us first assume the normal type. The normal law of
error being symmetric, we have $m_1 = m_3 = m_5 = 0$, and in this case
we are able to write down the Tchebychef coefficients required:

a ^)C E- n - n = ^ 

^ ^ ^ ^ y rrr^- ^ ^ V 

Nevertheless we will proceed according to No. 15. But we will confine
ourselves to giving the resulting data of the different steps of
computation only. A full reproduction of the complete process of
reckoning is to be found in No. 19, dealing with a somewhat more
general situation.

In the Kelley-Wood tables we find 

4 . 4 = -1.03^^33 ^,= 

i ^ 

^.^,IZ5UI ^..iXSOUi t^^.3g-5^2e 


4,- 

V 


. 3953ZO 







We obtain 


777^^0 


7r7^=:r.Slf7f 


717^ ^ O 




■m^-o 


77|= 

-.03000 X^^i-.073n X.^^lldsn 

'tT = . 97700', 




€L 

^>ox 


f ’ 

a 

-I-. 

h 



V/ 

0 

r,g797f 

0 


0 



0 

c 

t/ 

\ o 

o 

O 



d 

o 




.13717 


A^=: .1^31$ 


771 0 

+7.7¥i362 

c 

i- 

\^.SS575 


& 


•■'■6 


0 

1 


O 

/ 


o 
o 


) 


tc 

3 




%,-o ', 


■' = (i 


0 

1 
o 


o 

O 

! 


C 

o 


■} 


t ~o 

'3t 


^5, 


-’l.nsis 


T = o 
'32. 


/»/ = <a^ = -.03<5©o (f^a. = +.95<706 ajt=+ jojs? g^»+- . erf/o 

5 =. + K-r^- .on<?3 

^ = .‘??3S2 ^#2= .W53 %= 

For comparison we give the value which is furnished by the traditional
concept of dispersion:

$$\sigma_0 = .93600 .$$
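The mean and the traditional dispersion quoted for this example are easily verified. The short sketch below assumes that the traditional dispersion is the root of the mean squared deviation with divisor n, and it takes the seventh and eighth observations as +.3 and +.6.

```python
import math

observations = [-1.5, -1.0, -0.7, -0.5, -0.3, 0.0, 0.3, 0.6, 1.0, 1.8]
n = len(observations)

mean = sum(observations) / n
# Traditional dispersion: root of the mean squared deviation (divisor n).
dispersion = math.sqrt(sum((x - mean) ** 2 for x in observations) / n)

print(f"mean = {mean:.5f}, dispersion = {dispersion:.5f}")
```

With these readings the mean comes out as -.03000 and the dispersion as .93600, in agreement with the figures of the text.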

II. Let the same observations as above be given, but now let us assume
the step type. We can make use of the vectors in No. 13, which give at
once

^ .nni 

We note that for our observations the normal type is, up to the degree
3, better than the step type.





18. Analysis of Frequency Groups

In economic statistics, observations very often do not appear in the
form dealt with in the preceding chapters. Instead, they usually are
gathered into groups, so that there is given a set of

values X, K { * * • ^ corresponding positive values 

, not- necessarily integers. If 4/ means the -.sum of 

^ . . . . ^ ^ the ratios 

i _ A'. . X - C = 

j-f-Tv’ ’ Tv -rT 

are called the "'frequencies'' of the "observations'^ * 

The frequencies satisfy 

We shall now have to extend our developments to make them applicable
in situations as stated above. To anyone who is familiar with integrals
and sums in the sense of Stieltjes, it is clear that no special
difficulty can arise.

Again we have to start from a frequency function , and 
to agree which values should be designated as '‘best 

values'’. Reflections similar to those of No. 2 make it reasonable 
to choose 


> ? - 

Apart from the best values, we only have to modify the defini- 
tion of the product of vectors (No. 4), We define 

Ji^ tL, }/,f, -f ^2 V^4+-- - +1^ 

If these modifications are kept in mind, all the definitions, theorems, 

proofs and remarks of Nos. 4-16 remain unaltered. Of course, 
the abbreviations m^mid Xx now be read 

= A A u=c>^i,. 

Xy ^ ■■■■■ - *c). 


) 


and the "controls A - D (No. 16) : 

'rt 

A. - . , -f- 3 [rr?^ ^ 







B. 

i- 3(7r?^ -h t ~ 


C. 



D. 

/ -I- 



We are now in the position to illustrate the mechanism of skewness and
kurtosis still more impressively than in No. 13. For this purpose we
start from the frequency curve represented in Fig.
3 ; we choose and 


+.2 +.4 4.6 +.8 


Then the best values become equidistant, and they are given by the
abscissae of the points marked by small circles:

-.8, -.6, -.4, -.2, 0, +.2, +.4, +.6, +.8.

The ordinates ri^ of these points are proportional to d(, , namely : 



The table below^ gives the corresponding vectors ^3 » 

and also the vectors ^ and exam- 

ples of distributions which show skewness or kurtosis in all purity. 

















As in No. ISj, let us illustrate the relations between the vector 
and the vectors ^ by means of some fig- 

ures. This time however, we shall not only consider the components 
of the vectors, but also operate with the values . We do that by 
associating every vector Cw^^***^ u^) with the system of points 

^ ( ‘^■n. >/•>.) - 

Thus, in the figures 4a - 5b,, the vector is every time associated 
with the system of points marked by crosses, whereas the system 
of points marked by circles successively correspond to the vectors 




The statements in No. 13 concerning the meaning of skewness as
overconcentration to the left or to the right, and of kurtosis as
overconcentration or underconcentration near the mean, should be
recognized.

Until now, the frequencies were supposed to be really positive, but
there is no difficulty in allowing some of them to equal zero. Then, it
is true, the formulation of some intermediary theorems must be changed.
Yet, the existence and the main properties of the Tchebychef
coefficients remain untouched, and their values are independent of
those observations for which the corresponding frequencies are equal to
zero. To know this is sometimes useful in order to get a scheme of
computation of the highest possible uniformity.

19. Example

To conclude, we reproduce the reckoning of an example, frequently
discussed, concerning observations of the right ascension of the pole
star (see: A. L. Bowley, Elements of Statistics, 4th ed., p. 255). The
given data are

Value:    -7  -6  -5  -4  -3  -2  -1   0  +1  +2  +3  +4  +5  +6
Number:    1   6  12  21  36  61  73  82  72  63  38  16   5   1

and the normal type shall be assumed.

%■ 

Because the function , ' - 

e = 

satisfies /4 = = 1, it will be suitable to start from the best 

values of this specimen. These best values stretch from $-3$ to $+3$
approximately. In order to have the values $u_\nu$, with which we
intend to work, in coextension with $\xi_1, \ldots, \xi_{14}$, we
choose

$$u_\nu = \tfrac{1}{2}\, x_\nu, \quad \text{i.e.} \quad u = (-3.5,\ -3.0,\ \ldots,\ +3.0).$$

Between the means $M$, $M^*$ and the dispersions $\sigma$, $\sigma^*$
of the observations $u$ and $x$ there exist the connections (theorem 9)

$$M^* = 2M, \qquad \sigma^* = 2\sigma,$$

whereas the measures of skewness and kurtosis as well as the measures
of approximation do not change at the transition from $x$ to $u$
(theorems 11 and 12).
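The passage from the grouped data to frequencies, and the effect of halving the values, can be followed with a few lines of arithmetic. The sketch assumes that the fourteen groups carry the numbers 1, 6, 12, 21, 36, 61, 73, 82, 72, 63, 38, 16, 5, 1 at the values -7 to +6, and that mean and mean square are formed with the frequencies as weights; the variable names are illustrative.

```python
values = list(range(-7, 7))                                  # -7, -6, ..., +6
counts = [1, 6, 12, 21, 36, 61, 73, 82, 72, 63, 38, 16, 5, 1]

N = sum(counts)
frequencies = [c / N for c in counts]       # these add up to 1

# Halved values, so that they spread over roughly -3 ... +3.
u = [v / 2 for v in values]

mean = sum(f * t for f, t in zip(frequencies, u))
mean_square = sum(f * t * t for f, t in zip(frequencies, u))
variance = mean_square - mean ** 2

print(N, round(mean, 5), round(mean_square, 5), round(variance, 5))
```

The mean of the halved values comes out near -.08522 and their mean square near 1.34754, the figures that reappear in the tables and in the finishing computations of this example.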





Computation of ; X,/ ■ - , X3 J 


u 

4. 



C£ 


I 

-3.08242 

.00205 339 

-.00632 941 

.01950 990 

- .06013 77 

2 

-2.39928 8 

.01232 033 

-.02956 002 

.07092 300 

- .17016 47 

3 

-1.93176 9 

.02464 066 

- .04760 006 

.09195 232 

- .17763 06 

4 

“1.54996 6 

-04312 11 

“ .06683 62 

.10359 38 

- AmS6 69 

5 

'1.17951 6 

.07392 20 

' .08719 22 

.10284 46 

- .12130 67 

6 

^ .77663 9 

.12525 7 

' .09727 9 

.07555 1 

- .05867 6 

7 

- .36846 6 

.14089 7 

-.05523 2 ' 

.02035 1 

- .00749 9 

8 

+ .03861 3 

.rm; s 

f .00650 2 

.00025 1 

.00001 0 

9 

f .44963 0 

.14784 4 

f .06647 5 

.02988 9 

r .01343 9 


t-. .88571 7 

.12936 34 

t-. 11457 94 

.10148 49 : 

t- .08988 69 

li 

■M. 37743 5 

.07802 87 

10747 95 

.14804 60 

t- .20392 37 

12 


.03285 421 

t .06240 756 

.11854 503 

^ .22517 98 

13 

t 2.44778 6 

; 01026 694 

t .02513 127 

.06i51 597 

f .15057 79 

14 


.00205 339 

+ .00632 941 

.01950 990 

.06013 77 



+ 1.00000»W. 

-. 00113 =^ 777 , 

+.96397== W;, 

-.01283- Wj 


1/ 



Cf. 

I 

.18536 96 

- 

.57138 7 

1.76125 5 

2 

.40827 31 

- 

.97956 7 

2.35026 3 

3 

.34314 13 

- 

.66287 0 

1.28051 2 

4 

.24887 3 

- 

.38574 5 

.59789 2 

5 

,14308 3 

1 

.16876 9 

.19906 6 

6 

.045^7 0 

- 

.03539 1 

.02748 6 

7 

.00276 3 

- 

,00101 8 

.00037 5 

8 

.00000 0 


.00000 0 

.CH)000 0 

9 : 

.00604 3 

+ 

.00271 7 

.00122 2 

10 

.07961 4 

•f 

.07051 5 

.06245 6 

11 

.28089 2 

4 “ 

.38691 0 

.53294 3 

12 

.42773 58 

+ 

.81249 7 

1.54336 2 

13 

.36858 25 

4- 

.90221 1 

2.20841 9 

14 

.18536 96 

t 

.57138 7 

1.76125 5 



■ 


■f'12,32651-77r,- 















1/ 





BBW 


1 

- 3.5 

- .00718 687 

.02215 29Si 

- .06828 47 

.21048 2 

.02515 4 

2 

-3,0 

- M3696 099 

.08868 006 

- .21276 90 

.51049 4 

.11088 3 

3 

-2.5 

- .06160 164 

.11900 014 

“ .22988 08 

.44407 7 

.15400 4 

4 

-2.0 

- .08624 23 

.13367 26 

- .20718 80 

.32113 4 

.17248 5 

5 

-1.5 


.13078 83 

- .15426 69 

.18196 0 

.16632 4 

6 

-1.0 

- .12525 67 

.09727 9 

- .07555 1 

.05867 6 

.12525 7 

7 

- ,5 

-.07494 9 

..02761 6 

- .01017 6 

.00374 9 

.03747 4 

8 

.0 


,mm 0 

.00000 0 

.00000 0 

,00000 0 

9 

•«- .5 


.03323 8 i 

+ .01494 5 

.00672 0 

.03696 1 

10 

fl.O 

t .13936 34 

.11457 94 

+ .10148 49 

.08988 7 

.12936 3 

11 

•fl.5 

f .11704 31 

.16121 93 

+ .22206 91 

.30588 6 

.17556 5 

12 

^2.0^ 

+ .06570 842 

.12481 512 

+ .23709 01 

.45036 0 

.13141 7 

13 

f2,5: 

f .02566 735 

.06282 818 

+ .15378 99 

.37644 5 


14 

•f3.0 


.01898 820 

+ .05852 96 

.1804! 3 


_J 


- .08522= X, 

+1.13486- X, 

MM 

+ 3.14028=Xj 

1 1. 34754 


Controls A - D. 


u 


00 ’ 





1 


- 9.0304 

- ,01854 3 

t .54306 S 

+ .06490 0 

.01283 4 

2 


- 2.73982 

- ,03375 5 

f- .46622 0 

•f- .10126 6 

.04928 1 

3 

- .93176 9 

- .80896 

- ,01993 3 

4 .14369 S 

f- .04983 3 

.05544 1 

4 


- .16634 


^ .02670 9 

+ .01434 6 

.04312 1 

5 

- .17951 6 

- .00579 


+- .00070 2 

.00064 2 

.01848 1 

6 

+ .22336 1 

4- .01114 


- .00065 4 


.mm 0 

7 

^ .63153 4 

+ .25188 


- .00188 9 


.03747 4 

8 

f 1.03861 3 

+ 1.12037 

+ ,18864 5 

4- .00001 1 


.16837 8 

9 


+ 3.04629 


f .04093 9 

+■ .22518 8 

.33264 9 

10 

f 1.88571 7 

+ 6 70548 

M744 3 

^ .60273 4 

M744 4 

.51745 4 

■ 11 

^2.37743 5 

+ 13.43773 


■f- 2.74027 2 

1-1.57279 4 

.48767 9 

12 

f-2.899S3 0 

+24.377r4 


r 5.48924 0 


.29568 8 

13i 

t3M77B 6 



f 6.17137 8 

BES^B 

,12577 0 

14' 

UM242 



r 4,mm 2 

.41912 6 

.03:^5 4 




+ 3S7S70 

4-20.31408 

f 5.94902 

2.17710 


-f* 5 f W 3 '== 3;87S69 

WLf-3£'Wa#-tr?5)^-m£, = 20.31408 

H%,i~ XJ-f" X 3 = 5.94901 

' . , 'g,*' =. 2J77iO 

























Computation of - (twice underlined) 


and of 

A.,-- 

' 7^3 > with control by sums. 

1 

^ .00113 

-J- .96397 

-'.01283 

1.95001 

- .00113 

^ .96397 

- .mm 

f ,96397 

' .01283 
t .00109 
- .01174 

f2. 72531 
- .00001 
2,72530 

+ 3.67532 
+ .00220 
+ 3.67752 

•t .96397 

" .01283 
+ .00109 
- .01174 

+ 2.72531 
- .92924 
+ 1.79607 

- .05851 
+ .01237 

- .04614 

+ 3.61794 
- 1.87975 
+ 1.73819 

* .01283 

•f' 2. 72531 
- .00001 
f 2.72530 

- .05851 
+ .01237 

- .04614 

+12.32651 
- .00016 
+12.32635 

+14.98048 
+ .02502 
+15.00550 


+ .96397 

-.01174 

+2.72530 

+ 3.67752 


- ,01174 

+ 1.79607 
- .00014 
4-1.79593 

- .04614 
+ .03319 

- .01295 

+ 1.73819 
■+ .04479 
+ 1.78298 


-h 2.72530 

- .04614 
+ .03319 

- .01295 

+12.32635 
- 7.70486 
+ 4.62149 

+ 15.00550 
-10.39694 
+ 4.60856 



+ 1.79593 

- .01295 

+ 1.78298 



- .01295 

+ 4.62149 
- .00009 
+ 4,62140 

+ 4.60856 
+ .01286 
+ 4.62142 


A, = .98182 

=1-34012 

A =2.14974 
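The three quantities at the foot of the scheme (.98182, 1.34012, 2.14974) can be recovered, to the accuracy of the five-place reckoning, as the square roots of the successive pivots of an ordinary elimination. The sketch below assumes that the scheme is built from the moments of the best values, the element in row i and column j being the moment of order i + j, with the moment of order 0 equal to 1; the list `normalizers` is an illustrative name.

```python
import math

# Moments m_0 ... m_6 of the best values, as read from the first row and the
# last column of the scheme (m_0 = 1).
m = [1.0, -0.00113, 0.96397, -0.01283, 2.72531, -0.05851, 12.32651]

# Scheme a[i][j] = m_(i+j) for i, j = 0 ... 3.
scheme = [[m[i + j] for j in range(4)] for i in range(4)]

pivots = []
for col in range(4):
    pivot = scheme[col][col]
    pivots.append(pivot)
    # Add to every lower row the pivot row multiplied by -(element / pivot).
    for row in range(col + 1, 4):
        factor = -scheme[row][col] / pivot
        scheme[row] = [r + factor * p for r, p in zip(scheme[row], scheme[col])]

# Square roots of the second, third and fourth pivots.
normalizers = [math.sqrt(p) for p in pivots[1:]]
print([round(v, 5) for v in normalizers])
```

Small differences in the last decimal place against the printed scheme are to be expected, since the hand computation rounds every intermediate element to five places.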





Compulation of control by sums. 


YT, =i-. 00113 


1 • .00113 ^ .96397 - .01283 

-h .00113 .00001 4" ,00319 

1 ' 0 4 - .96396 - .00964 

•M.950D1 
■f ,00431 
t- 1.95432 

1 - .01218 + 2:82716 

+■2.81497 

1 - .00721 

+• .99279 

= - .96396 

T.j =■*'■01218 

10 .96396 - .00964 

- .96396 .00695 

1 0 0 - .00269 

+•1,95432 
.95701 
+• .99731 

1 * -.01218 f- 2.82716 ' 

4- .01218 - ,mB9 

1 0 4*2.82707 

4- 3.81497 
+' .01209 
t 3.8271^ 

1 - .00721 

4* .99279 


■f-io = + -«)269 
%, -- 2.82707 
r,^ = ^ -“721 

Computation of , with control by sums. 


1 

,00113 

1 

- .96396 
4- .01218 
i 

+ .00269 
2.82707 
+ ,00721 

1 

+ .03986 
- 1.81489 

1 4 1.00721 
4 1,00000 

-- MS22 

- .00010 
-ti.ism 

4^ .01382 
- .17021 

- .(KK)23 
*-3.20833 

- ,00123 
4- 344028 ‘ 

4 .00340 
- 2.0S96S 
.17144 
4“ 344028 

- .08522 

+143476 

-.07425 

- .06951 

+ .'90579 





Finishing computations.

M  = a_0 = -.08522     σ  = a_1 = 1.15577     a_2 = -.05541     a_3 = -.03234
M* = -.17044           σ* = 2.31154           S   = -.04794     K   = -.02798

mean square of u = 1.34754    a_1^2                 = 1.33580    M_1^2 = .99666    M_1 = .99833
a_0^2            =  .00726    a_2^2                 =  .00307
σ_0^2            = 1.34028    a_1^2 + a_2^2         = 1.33887    M_2^2 = .99895    M_2 = .99947
                              a_3^2                 =  .00105
                              a_1^2 + a_2^2 + a_3^2 = 1.33992    M_3^2 = .99973    M_3 = .99987

So long as we pay regard to the Tchebychef coefficients
$a_0, a_1, a_2, a_3$ only, the purport of our results is that the
observations are somewhat overconcentrated to the right, and somewhat
underconcentrated near the mean. The sum of the squares of the
Tchebychef coefficients with higher subscripts than 3 is

$$a_4^2 + a_5^2 + \cdots + a_{13}^2 = .00036;$$

it is small compared with $a_2^2 = .00307$ and $a_3^2 = .00105$. The
vectors $y_4, \ldots, y_{13}$ being normalized, we are sure that the
influence of $a_4, \ldots, a_{13}$ cannot essentially disturb our
statements.
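The measures of approximation of this example follow from the squares of the coefficients by a short computation. The sketch assumes that the measure of degree k is the square root of the ratio of the first k squares (subscripts 1 to k) to the variance of the observations, the variance being the sum of all the squares with subscripts 1 to 13; the figures are those quoted in the finishing computations.

```python
import math

# Figures quoted in the finishing computations.
variance = 1.34028                       # a_1^2 + a_2^2 + ... + a_13^2
squares = [1.33580, 0.00307, 0.00105]    # a_1^2, a_2^2, a_3^2

partial = 0.0
measures = []
for sq in squares:
    partial += sq
    measures.append(math.sqrt(partial / variance))

remainder = variance - partial           # a_4^2 + ... + a_13^2

for k, value in enumerate(measures, start=1):
    print(f"M_{k} = {value:.5f}")
print(f"remainder = {remainder:.5f}")
```

The remainder is the share of the variance left to the coefficients beyond the third, which is why it bounds their possible influence on the statements made above.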

Finally we give an illustration by computing and drawing the 
“best curve^’ of the normal type, corresponding to the observations 
. With it we mean that curve j ? the best values 

of which are the components of the vector 
values Y',/3 (see No. 2) have to satisfy 

substituting 

we get. ' 







hcEice 

4 *^ =-tl. 17717 

3= ^ i- j =~ .08389. 

Ao 1^1 

With these values T and B , the curve in Fig. 6 represents the 

' 4 , 

fnnction _ (%-/$) 

t ~ — 3 ^^ 

^ ^ ■ 



The abscissae of the points marked by circles are the observations
$u_\nu$, their ordinates are equal to the corresponding frequencies
divided by the length 0.5 of the group intervals.


University of Kiel, Germany. 



EDITORIAL


A New Type of Average for Security Prices
The market averages that are most popular with the American investing
public are essentially weighted or unweighted means of security prices
at designated intervals. As a rule, they ignore the volume of sales, an
element to which experienced traders attribute considerable importance.
Such averages endeavor only to reflect the average price level at
periodic intervals, and all of those published are entirely
satisfactory in this respect.

In this note we shall discuss an acquisition average which, instead of
being concerned with the price level at a given moment, attempts to
answer the question, "What is the average price actually paid for the
securities by their present owners?"

The problem can best be appreciated by presenting two examples of
acquisition averages prior to the mathematical theory. The first entry
of Table 1 states that for the week ending January 7, 1928 United
States Steel common closed at 150 6/8, and that the acquisition average
on this date was $137.75. At the time of the market crash in October,
1929, the acquisition average had risen to about $212, and at the
present moment this average has receded to about $48. Of course, some
of the individuals who bought Steel at about $200 per share are still
holding on to it, whereas others among the present holders obtained
theirs in 1932 at less than $25 per share. According to our theory, the
mean of such acquisition prices is the $48 noted above.

As an illustration of corresponding averages computed on a daily basis,
Table 2 presents the daily closing prices and acquisition averages for
Auburn, covering the last half of 1933. This stock was selected because
of its relatively small capitalization and frequent activity.





1523 

w • 05e 


. ! 1 

A. A. ! 

1930 

C' -u... 


'i ‘T 



I-" 5 

1 61 .6 

^ iro ( 

i Jiw.t/*-!- * 

l-“ 4 

iCa 



1 

33, ? '* 

12 

1 66.3 

153.09 ! 

11 


I.,,.'!' 

21 


M pn 1; 

14 

1SS.4 

1S6.44 

,io 


j lt.1 

21 

146 6 

V ;| 

26 

1S7.1 

260.23 1 

25 

c-,;- 1 

iJ9Hi 

n' 

ViC.H- 

16?,42 1; 

2- 2 

'i847 

162.07 

2~* ' 



!» 

145.7 


9 

1731 

163.17 

8 

2Si.t ) 


18 

140,0 

139.70 i 

16 

169.3 

163.64 

15 

lSf,0 

183.37 

25 

140.0 , 

13971 

23 

182.0 

164.54 

22 

1S3.0 ; 

188.15 

3-^. 3 

140.2.^ 

138,00 

• 3- 2 

188,4 X 

165.62 

3- 1 

184, 3 X 

186,21 

. 10 

im 

USJ3 

9 

185.2 

166.98 

8 

181.0 1 

186.07 

17 

147J 

138.99 

16 

188.0 

168.58 

IS 

179.2 

185.88 

24 

147.5 

339.83 

23 

181.1 

169.51 

22 

187.6 

ISS6S 

31 

4- ^7 

14 

21 

28 

5- S 

12 

19 

26 

'6-2 

9 

16 

!47.3 

!4?..1 

ism 

145.5 
145.3 
1481) 1 

148.6 
'145.6 

146.7 
145.2.^ 
140,6 
138.5 

140,35 

MOi! 

141.09 

141.34 ' 
14L48 

141.58 

141.90 

142.10 

142,22 

140.70' 

i40.S2 

14075 

30 

4- 6 

13 

20 

27 

5- 4 

11 

18 

25 

6- 1 

8 

15 

183.6 

186.3 

188.5 
186.0 
186.2 
182.1 ■ 

179.4 

174.6 

167.7 

165.0 X 

168.0 : 
175.SA 

170.46 

172,06 

173.52 

374.41 

174,79 

175.20 

175.33 

175,38 

175.25 

173.28 

173.16 

169.40 

29 

4- 5 

12 

19 

26 

5- 3 

10 

17 

24 

31 

6- 7 

14 

193.4 
'196.4 
193,1 
1952 
188.0 
1.702 

172.6 

172.7 
172,0 
1715 X 
1642 

162.4 

185.97 
!86i0 

186.98 

187.50 

187.51 
186J3 
18SJ5 
18SJ© 
184 JO 
18187 
182.48 
18L42 

23 

133.2 

140.52 

22 

180,6 

169.80 

21 

1SS.2 

179,79 

30 

237.3 

140,39 

29 

190.6 

170.90 

28 

156.0 . 

17152 

7. 7 ! 

140.0 1 

140.35 

7’^ 6 

196.3 

172.43 

7- 5 

,157.7 

177.99 

14 : 

136.0 1 

140.26 

i 13 ,' 

202.3 . 

174,63 

. 12 

160.6 

177,46 

21 i 

'138.2 I 

I4(U6 } 

i ■ 20 

207,7 1 

177,48 

19 i 

166.6 

177,03 

28 1 

144.2 i 

14071 f 

1 27 

206.0 ! 

179.8S 

26 ! 

169.7 

176.83 

^ 4 I 

140.3 1 

141},2» 1 

1 3 

MA 

182.20 i 

8-2 

Wi2 ; 

> 176,61 

J1 j 

1427' 1 

1411.32 1 

1 10- 

218.0 

185.68 1 

9 1 

!5ft4 : 

! 17(5,03 

18 1 

149.0' I 

1407S i 

! ^7 

238.5 

192.17 1 

4 6 i 

,!65.:i 

! 175i2 

25' 

IS'L2 i 

14'1.^5 } 

1 24 . 

1258.2 

1,97.54 1 

23 

168.2 

1 17529 

9- 1 

154.0;t^ 

141151 

31 

1256.4 X 

; 198.64 1 

30 

171.2 X 

^ J73JS 

B 

1557 i 

i4L54 

9^.. 7 

1247.4 

^202,19 1 

9-6 i 

1711 

^ 173J4 

15 

22 

29 
'lil- 6 

13 

20 1 
27' 
!i-*.3 

10'''' 
'' 17',‘ 

' , 24^'; 
iZ- f ; 

-S'l 
"15' ‘ 

22 

29 

159.0 1 

157.6 1 

159.2 1 

158.3 ^ 

164.6 

162.0 ; 
I261.S i 

169.7 
1«,3 
1717 
''167.6 ■' 
i65J^, 
''1512 
"ISI.i , 
IMM. 
1'S'94 " 

142.71 . 

143,62 
145.12 
,146.31' 1 

147.56 1 

14B.26 ! 

149.16 '• i 
149.68 
150.34, ' 

152.26 

152.95 ■' 
1SI.SI) '. ; , 

151.95 ' 
•,'!5,I92 
,15157:- ,,' 

14 i 
21 ‘ 
28 ■ 

! UU 5 

12 

19 

' 26 
11- 2 

9 

■ 

■ 23 ■' 

I ' 30 , 

I 12-^ 7' 

I' . 14" ^ 

■ 21' '1 

■ ■ '28 ' 

1233.2 

232.1 , 

225.0 : 

217.6 

230.6 

209.0 
203.4' 

193.2 

171.0 

164.2 

167.0 ■■ 
1.62,1 X 
tS2.6 

174.0 

lao 

164.4' 

I20S.43 i 

207.70 1 

210.12 i 

211,40 1 

21Z56 \ 

213.41, 

212.38 

2IL06 

209,67- 

207,09 

205,42 

201.81 

200.30 

i96.S7 

1951,1 , 
193.76 ' 

23 

, ' 21) : 
27 

1 10- 4 

11 

18 

I 2S 

1 11- 1 

1 - ' ^ 

1 IS 

1 ■ 22 

I 2'9 

1 l'2-"6 

1 13 i 

1 . -"37- i 

1712 

163.7 

1582 

156.6 
148.4 

145.3 

151.4 

145.6 

140.4 1 

147.7 : 
':1472 

145.4 
!42.3 X 

136.4 ' 
1415' 
136i ', 

1 

J73i)f) 

17221 

mil 

169A6 
168.25 
,167.23 
166, .3§ 
165.39 
163.M- 
16312 
162.59 
,160,38 
159,24 
15819 
157.73 






TABLE 1 (Continued)


■ 1931 

Close 

A.A. , 

1932 

Qose 

BBli 

1933 

Qose 

A.A. 

1- 3 

143.4 

157.38 

msBi 

37.1 

83.57 

1-7 

29.7 

46.40 

10 

143.7 


9 

42.7 

80.59 

14 

29.6 

46.10 

17 

139.1 

156.65 ^ 

16 

44.1 

78.50 

21 

28.5 

45.90 

24 

142.3 


23 

41.4 

77.00 

28 

27.7 

45.74 

31 

139.2 

155.99 


37.3 

75.28 

2- 4 

26.3 

45.54 

2- 7 


155,67 

2- 6 

38.5 

73.93 

11 

28.3 

4526 

14 

145.2 


13 

49.0 

72.23 

18 

26.7 

45.20 

21 i 

148.6 



*48.4 

70.84 

25 

24.5 

44.97 

2S 

147.4 

154.87 

27 

47.0 

70.20 

3- 4 

262 

44,68 

3- 7 

146,6 X 

152.95 

3- 5 

50.6 X 

69.00 

1! 



14 

144.4' 

152.82 

12 

46.7 

68.44 

18 

30.6 

44.33 

21 

147.4 

152,65 

19 

41.5 

67.33 

25 

28.5 

44.13 

28 

I4L4 


26 

402 

66.50 

4- 1 

27.5 

44.00 

4-4 

140.0 

152.23 

■ 4- 2 

39.0 

65.47 

8 

30.3 

43.82 

11 

137.3 

151.80 

9 

34.6 

64.16 

IS 

32.4 

43.57 

18 

132.6 

151.22 

16 

332 

63,15 

22 

42.3 

43.23 

25 

125.5 


23 

29.1 

62.33 

29 

46.5- 

^4320 

S-2 

115.2 



282 

61,60 ' 

5— 6 

46.7 ' 

43.48 

9 

111.5 

146.24 

• 5-7 

30.0 

60.80 

13 

47.4 

43.66 

16 

101.5 

143.97 

14 

27.0 

60.29 

20 

47.2 

43.78 

23 

98.4 

141,33 

21 

. 29.0 

59.81 

27 

53.0 

44.05 

30 

91. 


28 

r 272 

59.34 

6- 3 

52.1 

4429 

(S-6 

89.3 

132.63 

6-4 

302 

58.58 

" 10 

55.4 

44.58 

13 

90.7 


U 

2625 

57.57 

17 

53.1 

44.97 

20 

92.7 


18 

25.4 

57.04 

24 

57.4 

45.34 

27' 

104.3 

125.91 

25 

23.4 

56.62 

7- 1 

59.7 

45.84 

7-4 i 

105.0 

124.96 

7-2 

23.6 

56.13 

8 

65.2 

46J2 

li 

96.4 

123.23 

9 

21.6 

55.83 

15 

64.2 

46.88 

18 

94,4 


16 

23.3 

55.43 

22 

52.2 

47.^ 

■ 25 

90.2 


23 

24.7 

55.18 

29 

54.3 

47.49 

8- 1 

85.7 

118.79 

30 

28.7 

54.51 

8- 5 

51.4 

47.56 

8 

86.0 


8- 6 

41.4 

53.70 

12 1 

53.4 

47.60 

15 

93.0 

116.89 

13 

37.4 

52.86 

19 

52.7 

47.66 

22 

87.4 

116.23 

20 

40.7 

52.38 

26 

58.4 

47.79 • 

29 

90.5 

115.66 

27 

48.3 

51.91 

9- 2 

552 

47.91 

SL 5 

83.0 X 

113.91 

9- 3 

51.4 

51.81 

" 9 i 

51.5 

47.96 

12 

80.5 

112.74 

10 

48,6 

51.73 

16 

55.0 

48.05 

19 

75.2 

110.78 

17 

38.7 

51.24 

23 

49.4 1 

48.17 

26 

77.1 

109.00 

24 

45.4 

50.82 

30 

45.S : 

48.16 

10- 3 

68.4 

107.16 

10- 1 

43.7 j 

50.63 

10- 7 

47.3 

48.14 

10 

70.7 

104.91 

8 

35.4 

50.18 

.14 

43.2 '' 

48.11 

17 

68,6 

103,87 

15 

37.7 ; 

49.76 

21 

352 

47.82 

24 

71.S 

102.97 

22 

35.1 

49.39 

28 

392 

47.57 

31 

67.4 

101.79 

29 

352 

49.12 

11- 4 

#.S 

47.43 

11-7 

72.3 

100.88 

11-5 

35.0 

4&87 ’ 

11 

422 

47.35 

14 

67.7 

^99.85 

12 

392 

48.53 

18 

432 

47.26 

21 

60.6 

S@.59 

19 

36.1 

48J2 

25 

45.0 

47.21 

28 

53.5 

96,92 

26 

32.7 

48.16 

12-. 2 

44.6- 

47.18 

12- 5 

54,0 X' 

#.36 

12-3 . 

30.4 


9 

, 47.4 

43.17 

12 

44.0 

#.78 

10 

322 

47.61 

16 

45.6 

47.16 

19 

41.4 ’ 

8?.09 

17 

30.1 

4723 

23 

47J 

47.15 

26 

,37.6 i 

85.55 

24 

26.5 

46.95 

30 

47.6 

47.16 




31 

27.4 

46.63 




















76 


EDITORIAL 


TABLE 2. 


Daily Closing Prices and Acquisition Averages for Auburn,
July 1st to December 30th, 1933.

















H, C. CARVER 


77 


We shall now develop the theory on which the preceding tables
were constructed. As a simple illustration let us suppose that 100
individuals start an enterprise, that a total of 100 shares of stock
are issued, and that each of the individuals purchases one share
for $100. The total book value of the issue at the date of issue
is therefore

$$V_0 = 100 \times \$100 = \$10{,}000,$$

and the acquisition average then is

$$A_0 = \frac{V_0}{100} = \$100.00.$$

If the first transfer of stock resulted from the sale of a single
share at 150, the total amount paid by the group now owning
all the issue is obviously

$$V_1 = 99(100) + 150 = 10{,}050,$$

and the new acquisition average is

$$A_1 = \frac{V_1}{100} = \left(1 - \frac{1}{100}\right) A_0 + \frac{1}{100}(150) = 100.50.$$

If somewhat later the next sale of stock is a single share at
50, we may assume that

$$V_2 = 99(100.50) + 50 = 9{,}999.50,$$

and consequently

$$A_2 = \frac{V_2}{100} = 99.995.$$

Our first assumption is, therefore, that whenever the sale of a
share of stock is recorded, it is equally likely that any one of the
previous holders sold the share. More will be said of this
assumption later.
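The illustration above can be retraced numerically. The following is only a minimal sketch under the first assumption (each sale replaces the cost of a randomly chosen holder, so the average moves by 1/L of the gap); the function names are hypothetical and not part of the original editorial.

```python
def update_average(avg, price, listed_shares):
    """Acquisition average after one share changes hands at `price`.

    Under the equal-likelihood assumption the seller's cost equals, on the
    average, the current acquisition average, so one share's worth of the
    old average is replaced by the new price.
    """
    return (1.0 - 1.0 / listed_shares) * avg + price / listed_shares


def acquisition_path(initial_avg, prices, listed_shares):
    """Successive acquisition averages for a sequence of single-share sales."""
    path = []
    avg = initial_avg
    for price in prices:
        avg = update_average(avg, price, listed_shares)
        path.append(avg)
    return path


# The 100-share enterprise: all shares issued at $100, then sales at 150 and 50.
path = acquisition_path(100.0, [150.0, 50.0], listed_shares=100)
print([round(a, 3) for a in path])
```

The two values printed correspond to the averages after the first and second transfers in the illustration.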

In generalizing, let us adopt the following notation : 

$L$ designates the number of share units listed for an issue.

$A_0$ is the acquisition average at a given initial date.

$p_x$ denotes the price at which the $x$-th unit of stock is
sold following the initial date.

$A_x$ is the acquisition average immediately after the sale of the
$x$-th unit.


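We have then that

$$A_x = \left(1 - \frac{1}{L}\right) A_{x-1} + \frac{1}{L}\, p_x ,$$

so that

$$A_x = \left(1 - \frac{1}{L}\right)^{x} A_0 + \frac{1}{L}\left[\, p_x + \left(1 - \frac{1}{L}\right) p_{x-1} + \cdots + \left(1 - \frac{1}{L}\right)^{x-1} p_1 \right]. \tag{1}$$

If we multiply both sides of (1) by $(1 - 1/L)$, shifting each price term one place, and then subtract the resulting equation from (1), we obtain

$$A_x = \left(1 - \frac{1}{L}\right)^{x} (A_0 - p_1) + p_x - \sum_{j=1}^{x-1} \left(1 - \frac{1}{L}\right)^{j} \left(p_{x-j+1} - p_{x-j}\right). \tag{2}$$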


We shall now make a second assumption, namely, that the
prices vary linearly from $p_0$ to $p_x$. To illustrate, if Steel closes
one week at 54 and during the next week 100,000 shares are sold,
after which the close is 59, our assumption means that after
20,000 shares were sold the quotation is 55, at 40,000 shares
the price is 56, etc. Actually the price trend between two dates
is not a straight line but rather a scattering of points. However,
the linear assumption introduces compensating errors which have
been found to result in only negligible variations in the resulting
acquisition averages. We may write, therefore,


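$$p_i = p_0 + \frac{i}{x}\,(p_x - p_0), \qquad i = 1, 2, \ldots, x, \tag{3}$$

and equation (2) then reduces to

$$A_x = \left(1 - \frac{1}{L}\right)^{x} (A_0 - p_0) + p_x - \frac{L-1}{x}\left[1 - \left(1 - \frac{1}{L}\right)^{x}\right] (p_x - p_0). \tag{4}$$

But since in practice both $L$ and $x$ are large integers we
may write

$$\lambda = \frac{x}{L}, \qquad \left(1 - \frac{1}{L}\right)^{x} \approx e^{-\lambda}, \tag{5}$$

and (4) then becomes

$$A_x = \alpha A_0 + \beta p_0 + \gamma p_x , \tag{6}$$

where

$$\alpha = e^{-\lambda}, \qquad \beta = \frac{1 - e^{-\lambda}}{\lambda} - e^{-\lambda}, \qquad \gamma = 1 - \frac{1 - e^{-\lambda}}{\lambda}. \tag{7}$$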
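As a check on the reduction from the share-by-share recursion to the three weights, the sketch below compares the two. It assumes the weights take the form $\alpha = e^{-\lambda}$, $\beta = (1-e^{-\lambda})/\lambda - e^{-\lambda}$, $\gamma = 1 - (1-e^{-\lambda})/\lambda$, which is a reconstruction from the two stated assumptions rather than a transcription of the printed formula; the number of listed shares and the initial average are hypothetical.

```python
import math


def turnover_weights(lam):
    """Return (alpha, beta, gamma) in A_x = alpha*A_0 + beta*p_0 + gamma*p_x."""
    alpha = math.exp(-lam)
    mean_decay = (1.0 - alpha) / lam   # average of exp(-u) for u in [0, lam]
    return alpha, mean_decay - alpha, 1.0 - mean_decay


def exact_average(a0, p0, px, x, listed):
    """Share-by-share recursion with prices moving linearly from p0 to px."""
    avg = a0
    for i in range(1, x + 1):
        price = p0 + (i / x) * (px - p0)
        avg = (1.0 - 1.0 / listed) * avg + price / listed
    return avg


# Hypothetical inputs: 800,000 listed shares, 10,000 sold, close moving 54 -> 59.
listed, sold = 800_000, 10_000
a0, p0, px = 50.0, 54.0, 59.0

alpha, beta, gamma = turnover_weights(sold / listed)
approx = alpha * a0 + beta * p0 + gamma * px
exact = exact_average(a0, p0, px, sold, listed)
print(round(approx, 4), round(exact, 4))
```

The three weights sum to unity, so the new acquisition average is a weighted mean of the old average and the two closing prices.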

Tables of $\alpha$, $\beta$ and $\gamma$ have been computed for the
interval rate of turnover, $\lambda$. With the aid of these, Tables
1 and 2 are readily extended. A slight difficulty is encountered
in determining the acquisition average at an initial point. At the 
outset it is necessary to assume two initial acquisition averages, 
one equal to the “high” at some point in the past, the other equal 
to the corresponding “low.” The true acquisition average cer- 
tainly lies between these two limits. It is necessary to start com- 
putations sufficiently far before the date of the first desired 
acquisition average so that the two series derived respectively 
from the highs and lows will converge to a single average. The 
length of the past experience period required will depend upon 
the rate of turnover of the stock. The activity in grains is fre- 
quently so great that the two series will converge over a period 
of two weeks. 
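The convergence of the two starting series can be sketched as follows. Because each interval multiplies the old average by $e^{-\lambda}$, the gap between a series started at the "high" and one started at the "low" shrinks by that factor per interval. The closing prices, the turnover rate and the two starting values are hypothetical, and the weight formulas are the reconstructed ones assumed in the text above.

```python
import math


def step(avg, p_start, p_end, lam):
    """Advance the acquisition average over one interval with turnover lam."""
    alpha = math.exp(-lam)
    mean_decay = (1.0 - alpha) / lam
    beta = mean_decay - alpha
    gamma = 1.0 - mean_decay
    return alpha * avg + beta * p_start + gamma * p_end


closes = [54.0, 59.0, 57.5, 61.0, 60.0, 63.5]   # hypothetical weekly closes
lam = 0.8                                        # hypothetical weekly turnover

from_high, from_low = 70.0, 40.0                 # assumed past high and low
for p_start, p_end in zip(closes, closes[1:]):
    from_high = step(from_high, p_start, p_end, lam)
    from_low = step(from_low, p_start, p_end, lam)

gap = from_high - from_low
print(round(from_high, 3), round(from_low, 3), round(gap, 3))
```

A more active issue (larger $\lambda$) closes the gap in fewer intervals, which agrees with the remark about grains.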

I wish to point out emphatically that this acquisition average
is an average and nothing more. Like any other average its
value depends largely upon the ability of the individual using it.
Although the use of this average might prove of value to an
investor, it can not rightly be said that this is a forecasting
formula. I doubt the existence of any valid method of forecasting,
mathematical or otherwise. The acquisition average merely
measures secondary phenomena, and provides a tool for recognizing
an unfavorable condition that might very easily be changed
into a favorable situation by any one of numerous causes.
Thus, if the market quotation is greater than the acquisition
average, it follows that the average owner of the stock in question
has a "paper profit." Moreover, since a sale is made when the
owner of the stock and the prospective purchaser can agree on
a price, and because of the peculiar psychology usually affecting
one possessing a paper profit, the excess of the market price over
the acquisition average tends through bidding to increase both
prices and acquisition averages. This vicious circle carries prices
too far in either direction until some "impressed force" changes
the trend abruptly.

Since the price of a security at a given time depends upon
the status of the entire market as well as the intrinsic value of
that security, it follows that a general average for the acquisition
figures for a number of the "market leaders" would probably be
of value to certain investors. In fact, any of the popular market
averages can be accompanied by corresponding acquisition averages.

Since in many cases fifty percent of the stock is kept to protect
control, it is evident that one might be justified in using
one-half the share units listed for the value of $L$ in formula
(5) for $\lambda$. Again, if one desires to investigate the status of the
group operating on margins, the amount of the "floating stock"
and the brokers' loans must be taken into consideration in
determining $\lambda$.

In conclusion let me point out that under the most favorable
conditions our method of determining the acquisition average can
do no more than a 100% successful questionnaire inquiring of
stockholders the price at which each share was purchased. Of
course all stockholders would not give such information if they
could, and couldn't if they would.



ON A NEW METHOD OF COMPUTING NON- 
LINEAR REGRESSION CURVES* 


By 


Walter Andersson,
Fil. dr, Stockholm.


In a memoir published in this journal in February 1930¹ Professor
S. D. Wicksell pointed out that the well-known Pearson
method² of computing skew regression curves by adopting the
principle of least squares can be simplified, and in some directions
generalized, by inserting some assumption concerning the
distribution function of the population studied. After some remarks
on the subject as advanced in the said memoir the problem was
presented to me by Professor Wicksell. The results obtained by me as
regards this problem were published as a part of my doctor thesis.³
In the course of the official ventilation of my thesis Professor
Wicksell made some interesting remarks concerning the relations
between my solution and the general Pearson solution. His
suggestion has led me to take up this special problem, which will be
considered in the following lines.

*From the Statistical Institution of the University of Lund, Sweden.

¹ S. D. Wicksell, Remarks on Regression.

² Karl Pearson, On the General Theory of Skew Correlation and Non-Linear Regression; Mathematical Contributions to the Theory of Evolution / Drap. Comp. Res. Mem., Biom. Ser. II, 1905.

³ Walter Andersson, Researches into the Theory of Regression, chapters IV-VI / Kungl. Fysiografiska Sällskapets Handlingar, N. F. Bd. 43, Nr. 1; also as Meddelande från Lunds Observatorium, Ser. II, Nr. 64 /.

1. We consider a bivariate distribution and denote the variables
$x$ and $y$. The distribution function, for the sake of simplicity
being supposed discontinuous, may be

$$z = F(x, y), \tag{1}$$

so that

$$\sum_x \sum_y F(x, y) = 1. \tag{2}$$

Let $g(x)$ be the regression function of $y$ on $x$. Thus

$$\bar{y}_x = g(x), \tag{3}$$

where $\bar{y}_x$ denotes the mean value of the dependent variate $y$ for
a fixed value of the independent variate $x$. Consequently we have

$$g(x) = \frac{\sum_y y\, F(x, y)}{\sum_y F(x, y)}. \tag{4}$$

We further observe that the marginal distribution of $x$ is

$$f(x) = \sum_y F(x, y). \tag{5}$$

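Expanding the regression function in the series of Tchebycheff
we put

$$g(x) = c_0\, \psi_0(x) + c_1\, \psi_1(x) + c_2\, \psi_2(x) + \cdots, \tag{6}$$

where $\psi_i(x)$ are polynomials of the $i$-th orders, fulfilling the
following condition of orthogonality

$$\sum_x f(x)\, \psi_i(x)\, \psi_j(x) = 0 \quad \text{for } i \neq j, \tag{7}$$

and

$$\sum_x f(x) \left[\, g(x) - c_0\, \psi_0(x) - c_1\, \psi_1(x) - \cdots - c_n\, \psi_n(x) \right]^2 = \text{Min.} \tag{8}$$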

From (7) and (8) it may be shown that the expansion by
Tchebycheff carried to some order gives the same approximate
expression for the regression as obtained by fitting a parabola of
the same order to the mean values of $y$ for every value of $x$,
each observation being allotted a weight proportional to the number
of individuals possessing the value of $x$ in question. Thus,
by using the series of Tchebycheff in treating the regression problem
we have as a matter of fact applied the same method of describing
the regression as applied by Yule⁵ and Pearson.⁶

⁴ Tchebycheff, Collected Works, Vol. I, pp. 203-230.

We observe that using the series of Tchebycheff we gain the
advantage of being able to perform the graduation successively for 
the higher orders. With respect to this circumstance I have used 
the notation successive regression coefficients for the coefficients 
of (6). 
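The "successive" property can be seen in a small numerical sketch. The polynomials are built by Gram-Schmidt orthogonalization of 1, x, x², ... under the weights f(x), which is one way (not necessarily the author's) of satisfying the orthogonality condition; the array values, frequencies and array means are hypothetical.

```python
def inner(u, v, w):
    """Weighted inner product: sum over arrays of w * u * v."""
    return sum(wi * ui * vi for wi, ui, vi in zip(w, u, v))


def orthogonal_polys(xs, w, order):
    """Values at xs of monic polynomials psi_0..psi_order, orthogonal under w."""
    polys = []
    for i in range(order + 1):
        p = [x ** i for x in xs]
        for q in polys:                      # Gram-Schmidt against lower orders
            coef = inner(p, q, w) / inner(q, q, w)
            p = [pi - coef * qi for pi, qi in zip(p, q)]
        polys.append(p)
    return polys


def successive_coefficients(xs, w, means, order):
    """c_i = sum(w * g * psi_i) / sum(w * psi_i^2) for i = 0..order."""
    polys = orthogonal_polys(xs, w, order)
    return [inner(means, q, w) / inner(q, q, w) for q in polys], polys


# Hypothetical array values, relative frequencies and array means.
xs = [-2, -1, 0, 1, 2, 3]
w = [0.05, 0.20, 0.30, 0.25, 0.15, 0.05]
means = [1.0, 1.6, 2.1, 2.3, 2.2, 1.8]

c2, polys = successive_coefficients(xs, w, means, 2)
c3, _ = successive_coefficients(xs, w, means, 3)

# Second-order graduation and its residuals from the array means.
fitted = [sum(c * q[k] for c, q in zip(c2, polys)) for k in range(len(xs))]
residual = [m - f for m, f in zip(means, fitted)]
print([round(c, 4) for c in c3])
```

Raising the order from two to three leaves the first three coefficients unchanged, and the second-order residuals are orthogonal (under the weights) to 1, x and x², which is the normal-equation condition of a weighted least squares parabola.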

Working out the solution for these coefficients we obtain from
(7) and (8),

$$c_i = \frac{\sum_x f(x)\, g(x)\, \psi_i(x)}{\sum_x f(x)\, \psi_i^2(x)}, \tag{9}$$

the polynomials being determined from (7).

The successive regression coefficients, except $c_0$, have been
shown / see W. Andersson, Op. cit., pp. 14-15 / to be independent
of the zero-values of the variables, and in some cases they are
found to stand in simple relations to the well-known semi-invariants
of Thiele.⁷ Especially when the distribution is assumed to be
generated according to the hypothesis of elementary errors the
semi-invariants of Thiele and the successive regression coefficients are
closely related. In this respect the denomination semi-invariant
regression coefficients may be suggested for the coefficients $c_i$. The
values of these coefficients ought to be derived in all more exhaustive
studies of curved regression lines.

⁵ G. U. Yule, On the Significance of Bravais' Formulae for Regression, &c., in case of Skew Correlation / Proc. Roy. Soc., Vol. 60, pp. 477-489, 1897 /.

⁶ Pearson, Op. cit.

⁷ T. N. Thiele, Theory of Observations, London 1903, p. 24. / See also Annals of Mathematical Statistics, Vol. II, pp. 165-307, where this work of Thiele is reprinted /.



We introduce the moments, $\nu_{ij}$, of the distribution. Taking
these about any point we have

$$\nu_{ij} = \sum_x \sum_y x^i\, y^j\, F(x, y). \tag{10}$$

If we observe that

$$\sum_x f(x)\, x^i\, g(x) = \nu_{i1}, \tag{11}$$

it is immediately seen from (9) that the coefficients $c_i$ can be
expressed as linear functions of the "mixed" moments $\nu_{01}$,
$\nu_{11}$, up to $\nu_{i1}$, all other quantities being dependent on the marginal
moments of $x$ alone.

This solution may shortly be summed up. For a fuller discussion
I refer to the cited memoir by the writer.

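We write

$$\psi_i(x) = x^i + e_{i,i-1}\, x^{i-1} + e_{i,i-2}\, x^{i-2} + \cdots + e_{i1}\, x + e_{i0}. \tag{12}$$

Let $\Delta^{(i)}$ be the following determinant of the marginal moments of $x$

$$\Delta^{(i)} = \begin{vmatrix} \nu_{00} & \nu_{10} & \cdots & \nu_{i0} \\ \nu_{10} & \nu_{20} & \cdots & \nu_{i+1,0} \\ \vdots & \vdots & & \vdots \\ \nu_{i0} & \nu_{i+1,0} & \cdots & \nu_{2i,0} \end{vmatrix} \tag{13}$$

and $\Delta^{(i)}_{hk}$ be its sub-determinant obtained by cutting out the $(h+1)$-th
row and the $(k+1)$-th column and multiplying by $(-1)^{h+k}$.

Then we have

$$e_{ij} = \frac{\Delta^{(i)}_{ij}}{\Delta^{(i-1)}} \tag{14}$$

and

$$c_i = \frac{\Delta^{(i-1)}}{\Delta^{(i)}} \left[\, \nu_{i1} + e_{i,i-1}\, \nu_{i-1,1} + \cdots + e_{i0}\, \nu_{01} \right], \tag{15}$$
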
or, using the "standardized" variables

$$\xi = \frac{x - m_x}{\sigma_x}, \qquad \eta = \frac{y - m_y}{\sigma_y} \qquad (m = \text{mean},\ \sigma = \text{dispersion}) \tag{16}$$

and introducing the coefficients

(17) [equation illegible in the scan]

where $\varepsilon_{ij}$ stands for the "standardized" moments and $r$ is the
usual Galton coefficient of correlation, we have / W. Andersson,
Op. cit., p. 16 /

(18) [equation illegible in the scan]


The relations between the successive or the semi-invariant
regression coefficients and the coefficients of the graduation
parabolas as written in their usual forms are easily obtained.
Taking the parabola of the $n$-th order

$$y = a_0^{(n)} + a_1^{(n)} x + a_2^{(n)} x^2 + \cdots + a_n^{(n)} x^n, \tag{19}$$

we have, indeed / Op. cit., p. 17 /

$$\begin{aligned} a_0^{(n)} &= c_0 + e_{10}\, c_1 + e_{20}\, c_2 + \cdots + e_{n0}\, c_n, \\ a_1^{(n)} &= c_1 + e_{21}\, c_2 + \cdots + e_{n1}\, c_n, \\ &\ \ \vdots \\ a_n^{(n)} &= c_n. \end{aligned} \tag{20}$$

The coefficients $e_{ij}$ are the same as those defined by (14).

2. Starting with the general solution just indicated we may
proceed further into the matter. It will be seen that some new
problems are met with in applying the general method to actual
statistics.

Taking account of the fact that the solution only involves the 
moments of the distribution, we can free ourselves from any as- 
sumptions as regards the distribution function itself. The required 
moment values may then be directly computed from the observed 
frequencies. This way of solving the problem leads to the method 
advanced by Pearson in his treatises on this subject. The solution 
evidently gives a least squares graduation to the observed array 
means when the weights of each mean value are proportional to 
the observed frequencies in the corresponding arrays. 

This method may be the most straightforward one, but it is,
however, by no means the simplest, nor the most efficient one.
Considering the fact that the term of the $i$-th order of the parabola
contains moments up to the $2i$-th order, we immediately
conclude that the arithmetical work would rise to a considerable
amount, and, with growing moment order, be more and more in
vain, as a consequence of the rapidly increasing sampling errors
of the computed moment values. Some other ways to treat the
problem must be sought for in order to eliminate these difficulties.

A first outline of a new method was suggested by S. D. Wicksell
in the year 1930 / Wicksell, Op. cit. /. Starting with the general
solution Wicksell pointed out that some well-known rules of
Thiele as regards the determination of moments of high orders
were directly applicable in the computation of high order regression
parabolas. The rules of Thiele referred to may be formulated in
the following way / Thiele, Op. cit., p. 24 / :

To obtain the first semi-invariants, or moments, rely entirely
on computations. To obtain the intermediate semi-invariants rely
partly on computations, partly on theoretical considerations. But
to obtain the higher semi-invariants rely entirely on theoretical
considerations.



Professor Wicksell's suggestion was that instead of the higher
marginal moments, involved in the least squares expressions for
the regression coefficients, should be inserted the moments of a 
suitably chosen frequency function (with a limited number of 
parameters), fitted to the marginal distribution of the independent 
variate. 

The method indicated was then more thoroughly studied by 
the writer of these lines / Op. cit. /. The solution obtained along 
these lines was in detail worked out, and it was also tested as
regards its practical usefulness in dealing with actual statistics.
Especially by use of the Pearson types of frequency functions 
very simple expressions, successfully applicable within a large 
domain of actual statistics, were deduced. 

An important advantage of this method / as well as of the
Pearson method / of computing high order regression parabolas
may be noticed. As the regression coefficients have been expressed
as functions of the moments only (in the method elaborated by
the author only of those of low orders) the influence of "grouping"
may be accounted for by correcting the computed moment
values in this respect. For this purpose suitable correction formulas
are available, as for instance the well-known ones given by
Sheppard. Experience has convinced me that at the ends of the
regression curves, at least, the effect of grouping can displace the
computed curve in a considerable manner, so that in many cases
some attention must be paid to these circumstances.
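For reference, Sheppard's corrections for the second, third and fourth central moments of a single grouped variate can be sketched as below. This shows only the standard marginal corrections for a class breadth h; how the author applied them to the mixed moments is not shown in the text, and the function name and sample figures are hypothetical.

```python
def sheppard_corrected(mu2, mu3, mu4, h):
    """Sheppard's corrections to central moments computed from grouped data.

    mu2, mu3, mu4 are the raw central moments obtained by placing every
    observation at its class mid-point; h is the class breadth.
    """
    mu2_c = mu2 - h ** 2 / 12.0
    mu3_c = mu3                                   # third moment needs no correction
    mu4_c = mu4 - (h ** 2 / 2.0) * mu2 + 7.0 * h ** 4 / 240.0
    return mu2_c, mu3_c, mu4_c


# Hypothetical grouped moments in units of a class breadth of one.
corrected = sheppard_corrected(mu2=4.0, mu3=0.5, mu4=50.0, h=1.0)
print([round(m, 4) for m in corrected])
```

The corrections matter most when the class breadth is large relative to the dispersion, which is when the ends of a fitted curve are most easily displaced.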

It is, however, to be remembered that the solution obtained
by applying Wicksell's proposition does not give a strict least
squares graduation to the observed array means, as a consequence
of the fact that the theoretical values of the high order moments
always in some degree differ from the directly computed ones.
From this it is evident that some care must be taken in choosing
the hypothesis as regards the marginal distribution of the
independent variate. It may be remarked, however, that these
circumstances cause very little practical difficulty on account of the much
refined theory of uni-variate distributions.

The discrepancy between a least squares solution and the 
solution as obtained by applying the method as advanced by the 
author may, as pointed out to me by Professor Wicksell in the 
course of the official ventilation of my thesis, be removed by an 
adjustment by which the latter solution is turned into a strict 
least squares solution. This problem will be considered in the 
following paragraph, and at the same time we shall get an oppor- 
tunity to study the hypothetical assumptions applied before from 
a somewhat different point of view. 

3. We consider the expression (8). Previously we have from
this condition worked out the general least squares solution by
assuming $f(x)$ to be the true marginal distribution (5) and $g(x)$
to be the true regression function, and then 1/ the directly
computed moment values were inserted in the general solution /
Pearson's method /, or 2/ the moment values required were
determined in accordance with the rules indicated in § 2 / method
elaborated by the writer /. Now we shall directly introduce into (8)
our working hypothesis concerning the marginal distribution of
$x$. Let the hypothetical $x$-marginal distribution function be $\omega(x)$.
The solution is then to be deduced from the following condition

$$\sum_x \omega(x) \left[\, g(x) - c_0\, \psi_0(x) - c_1\, \psi_1(x) - \cdots - c_n\, \psi_n(x) \right]^2 = \text{Min.} \tag{21}$$

It is immediately clear that, in this way, we always get a
strict least squares solution with respect to the distribution
function $\omega(x)$ whatever the form of $g(x)$ may be.

In fact, the functions $f(x)$ and $g(x)$ are totally independent
of one another, and, as is seen from (8), the distribution
function $f(x)$ enters into the expansion of Tchebycheff for the
regression function only as a weight function which determines
the weights to be allotted to the regression means in graduating
the values of these by means of this series carried to a
certain order, or, what is the same, by means of a parabola of the
same order, the coefficients of which are determined according
to the principle of least squares. Then it is clear that for practical
purposes it is not necessary to derive the exact form of $f(x)$ in
performing the expansion (6). The hypothetical distribution
function $\omega(x)$ would be expected to give a satisfying result as
soon as $\omega(x)$ in its main characteristics corresponds with the true
distribution function $f(x)$.

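I am going to work out the detailed solution for the following
two usual forms of $\omega(x)$:

A/ Normal Error Function,

$$\omega(\xi) = \frac{1}{\sqrt{2\pi}}\, e^{-\xi^2/2}; \tag{22}$$

B/ Pearson Type III Function,

$$\omega(\xi) = C\, (\xi + \alpha)^{\beta - 1}\, e^{-\alpha \xi}, \tag{23}$$

where

$$\alpha = \frac{2}{S}, \qquad \beta = \frac{4}{S^2}, \tag{24}$$

$S$ is the skewness, or $S = \varepsilon_{30}$.

In both cases the expressions for the terms of the series of
Tchebycheff will be found to be very simple.

At first considering the polynomials $\psi_i(x)$ we are to have,
in accordance with (7), the distribution function being continuous,

$$\int_{-\infty}^{\infty} dx\; \omega(x)\, \psi_i(x)\, \psi_j(x) = 0 \quad \text{for } i \neq j. \tag{25}$$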



From this expression it may be concluded that the polynomials
$\psi_i(x)$ are in case A/ the polynomials of Hermite, and in
case B/ those of Laguerre. Both these kinds of polynomials are
of well-known forms, and consequently the values of the $e$-coefficients
as defined by (12) may easily be derived from propositions
about these polynomials.

For the successive coefficients we have according to (9) the
following expression

$$c_i = \frac{\int dx\; \omega(x)\, g(x)\, \psi_i(x)}{\int dx\; \omega(x)\, \psi_i^2(x)}. \tag{26}$$

Taking account of (13), (14), and (15) and introducing the
notation

$$\pi_{i1} = \int dx\; \omega(x)\, x^i\, g(x), \tag{27}$$

we obtain

$$c_i = \frac{\Delta^{(i-1)}}{\Delta^{(i)}} \left[\, \pi_{i1} + e_{i,i-1}\, \pi_{i-1,1} + \cdots + e_{i0}\, \pi_{01} \right], \tag{28}$$



or, introducing the corresponding "standardized" moments
and putting

(29) [equation illegible in the scan]

we get

(30) [equation illegible in the scan]

The coefficients [symbols illegible] are determined by (14) and
(15) when the moment values






In order to derive the expressions for the computation of the
moment quantities $\pi_{i1}$, or their standardized values, we denote the class-breadth by $b$
and the observed mean value of $y$ in the $s$-th array of $x$ by $\bar{y}_s$.

The values of $\pi_{i1}$ are then given by the following formula

$$\pi_{i1} = \sum_s I_s\, x_s^i\, \bar{y}_s, \tag{35}$$

where

$$I_s = \int_{x_s - b/2}^{x_s + b/2} dx\; \omega(x). \tag{36}$$

The computation is easily performed as soon as the function

$$\Omega(x) = \int_{-\infty}^{x} dx\; \omega(x) \tag{37}$$

is known. In either case we have access to suitable tables of this
function. For the Pearson type III function the "Tables for the
incomplete $\Gamma$-function, edited by Karl Pearson" are to be used.

4. We will now make some general remarks concerning the
relations between the different methods of computing regression
parabolas touched upon in the preceding lines. We start with the
general condition (8) for the determination of the coefficients:

$$\sum_x f(x) \left[\, g(x) - c_0\, \psi_0(x) - c_1\, \psi_1(x) - \cdots - c_n\, \psi_n(x) \right]^2 = \text{Min.}$$

It is seen that the expansion is determined by the marginal
distribution function $f(x)$ and the regression function $g(x)$. If
$f(x)$ and $g(x)$ are not the true functions of the population but
the functions corresponding to the actual sample, the solution
will give the sampling values of the coefficients. This is the
solution advanced by Pearson, and consequently in his method no
graduation of the data is performed in order to smooth out the
influence of sampling irregularities on the values of the coefficients.
Without any further considerations it is clear that methods
which include an adjustment of the data in this respect are
desirable. The problem is analogous with that occurring in the
general theory of distributions. Among other facts of great im- 
portance that speak in favour of using mathematical functions 
for the description of distributions one is that we in this way are 
able to eliminate in some degree the accidental irregularities. 
When the regression is described by the series of Tchebycheff the 
smoothing process is evidently performed firstly by graduating 
the regression means by a parabola, and secondly by adjusting 
the parabola coefficients for the accidental irregularities. This 
latter adjustment has been accounted for by the two methods 
treated by the author. When using the rules of § 2 as principle 
for this adjustment the smoothing process is applied to the mo- 
ment values involved in the general solution for the coefficients, 
and in the methods indicated in the preceding paragraphs we have 
used a weight function which is to be considered as a graduation 
of the observed marginal distribution of the independent variate. 

As mentioned before we do not get a strict least squares solu- 
tion when applying the rules of § 2. This is, however, of little 
practical importance, but it remains to see in what manner this 
solution is to be modified in order to become a least squares grad- 
uation of the observed array means. 

When applying the rules of § 2 the product moments are
computed from the following expression

$$\nu_{i1} = \frac{1}{N} \sum_s n_s\, x_s^i\, \bar{y}_s, \tag{38}$$

where $N$ is the total number of observations and $n_s$ the number
of observations in the $s$-th array of $x$. We suppose that the
graduation is to be based on directly computed moment values
up to the $h$-th order, $h$ usually not being greater than six, in
accordance with the rules of Thiele. The values of the marginal
moments of the independent variate up to the $h$-th order indicate
the distribution function $\omega(x)$, which function is chosen
as the theoretical distribution function determining the values of
the marginal moments of orders above the $h$-th. A strict least
squares solution with respect to the distribution function $\omega(x)$
may be worked out according to the formulas given in this memoir
by taking for the product moments the following values

$$\pi_{i1} = \sum_s I_s\, x_s^i\, \bar{y}_s, \tag{39}$$

where

$$I_s = \int_{x_s - b/2}^{x_s + b/2} dx\; \omega(x), \tag{40}$$

$b$ being the class-breadth. Subtracting (38) from (39) we get

$$\pi_{i1} - \nu_{i1} = \sum_s \left( I_s - \frac{n_s}{N} \right) x_s^i\, \bar{y}_s, \tag{41}$$


which consequently are the corrections to be added to the directly 
computed values of the product moments of the solution worked 
out in accordance with the rules of § 2, in order to obtain a strict 
least squares solution. 

These corrections are easily computed as soon as the integrals 
are determined. This task, however, would in some cases be 
somewhat arduous. If the general Pearson theory of frequency 
is applied we must sometimes resort to mechanical quadrature 
formulas. . 



a remark concerning the correction for grouping of the moments
$\pi_{i1}$. According to the method of computing these
characteristics we may regard them as mixed moments of a distribution
having as its $x$-marginal distribution the function $\omega(x)$, the
regression means being the observed ones. Thus we evidently
can apply the usual methods of correcting computed moment values
for the effect of grouping. By using the formulas of Sheppard
we have to observe, however, that the moments involved in
these formulas must be referred to the supposed semi-theoretical
distribution.

5. Numerical Illustrations, In order to illustrate the appli- 
cation to observed data of the consideration above I have numeri- 
cally treated a few populations — representative ones in that they 
are examples of correlation distributions of different degrees of 
skewness. 

Example I. Case of slightly skew correlation. Pearson's
example B. Example I:2 and II:2 of the cited memoir of the
author. Population: Correlation between age and height of head
in 2272 girls.

/ x = age ; y = height of head /
x= X-X^=+.Z007 

71 = - 189 ^ 

As regards the moment values I refer to the memoir of Pearson.
These indicate that the marginal distribution of $x$ may
approximately be represented by the normal curve. Thus I take for
$\omega(x)$ the normal function.

We have to calculate the product moments $\pi_{11}$, $\pi_{21}$ and $\pi_{31}$.

These computations may be performed by using the following
scheme. The different values are derived from the correlation
table given in Pearson's memoir.

The values of $\xi$ correspond to the class ranges.



   x         ȳ       n/N         ξ         I   I - n/N        xȳ       x²ȳ       x³ȳ

  -9    -5.000                          .0015     .0011    45.000  -405.000  3645.000
  -8    -4.143              -2.831     .0037     .0006    33.144  -265.152  2121.216
  -7    -3.889              -2.513     .0084     .0005    27.223  -190.561  1333.927
  -6    -3.075     .0176    -2.186     .0170    -.0006    18.450  -110.700   664.200
  -5    -2.474     .0335    -1.860     .0311    -.0024    12.370   -61.850   309.250
  -4    -1.808     .0550    -1.534     .0511    -.0039     7.232   -28.928   115.712
  -3    -1.763     .0779    -1.208     .0756    -.0023     5.289   -15.867    47.601
  -2    -1.217     .1034    -0.881     .1003    -.0031     2.434    -4.868     9.736
  -1    -1.054     .1149    -0.555     .1199     .0050     1.054    -1.054     1.054
   0               .1360    -0.229     .1296    -.0064     0.000     0.000     0.000
   1    -0.194     .1158     0.098     .1238     .0080    -0.194    -0.194    -0.194
   2     0.232     .0871     0.424     .1106     .0235     0.464     0.928     1.856
   3     0.453     .0942     0.750     .0858    -.0084     1.359     4.077    12.231
   4     0.642     .0713     1.077     .0605    -.0108     2.568    10.272    41.088
   5     0.832     .0418     1.403     .0384    -.0034     4.160    20.800   104.000
   6     0.885     .0268     1.729     .0218    -.0050     5.310    31.860   191.160
   7     2.154     .0057     2.055     .0115     .0058    15.078   105.546   738.822
   8    -0.714     .0031     2.382     .0052     .0021    -5.712   -45.696  -365.568
   9     0.625     .0035     2.708     .0022    -.0013     5.625    50.625   455.625
  10     0.000               3.034     .0008    -.0001     0.000     0.000     0.000



We get

$$\sum_s I_s\, x_s\, \bar{y}_s = 2.9879, \qquad \sum_s I_s\, x_s^2\, \bar{y}_s = -6.6564, \qquad \sum_s I_s\, x_s^3\, \bar{y}_s = 75.5197$$

and from these values

$\pi_{11} = 3.1123$, $\pi_{21} = -2.2376$, $\pi_{31} = 79.2110$.

Sheppard's corrections for grouping have been applied.

For the corresponding standardized moments we obtain the
following values:

$\varepsilon_{11} = 0.2941$, $\varepsilon_{21} = -0.0689$, $\varepsilon_{31} = 0.8000$.

This leads to the following values of the $g$-coefficients defined
by (29):

$g_{21} = -0.0689$, $g_{31} = -0.0823$.

The values of the successive regression coefficients then become

$c_1 = 0.2941$, $c_2 = -0.0345$, $c_3 = -0.0137$.

Comparing these different values with the uncorrected ones
we find

e = 0.0000 

il 

i.. - =-0.00S6 

= + 0.0511 
■£j£i — ^30 — 0.0365 

4. - ^.0 = + 0.2894 


5 -,.- 


= + 

0.0021 

■ 


= — 

0.0337 

% - 

- < v -, 

= + 

0.0000 

% 

- “C 

= + 

0.0010 

^3 ■ 

- "^3 

= — 

0.0056 


We especially observe that the adjustments of the ^ -coeffi- 
cients are smaller than those of the moments of the same orders. 

The adjusted coefficients result in the following regression
parabola of the third order:

$$\eta = 0.0345 + 0.3352\,\xi - 0.0345\,\xi^2 - 0.0137\,\xi^3 .$$

The curve is drawn on diagram 1. For the sake of comparison
the graph of the Pearson curve and that obtained by applying
the rules of Thiele, the marginal being the normal curve, are given
on the same diagram.
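The printed figures of Example I can be retraced with a few lines of arithmetic. The relations used below, $g_{21} = \varepsilon_{21}$, $g_{31} = \varepsilon_{31} - 3\varepsilon_{11}$ and $c_i = g_{i1}/i!$, are not legible in the surviving text; they are assumed here because they follow from the Hermite polynomials $\xi^2 - 1$ and $\xi^3 - 3\xi$ under a normal weight function with the mean of $\eta$ taken as zero, and because they reproduce the quoted values.

```python
from math import factorial

# Standardized moments quoted in Example I.
eps = {1: 0.2941, 2: -0.0689, 3: 0.8000}

# Assumed Hermite relations (normal weight function, zero mean of eta).
g21 = eps[2]                     # from xi^2 - 1
g31 = eps[3] - 3.0 * eps[1]      # from xi^3 - 3*xi

c1 = eps[1]
c2 = g21 / factorial(2)
c3 = g31 / factorial(3)

# Collect c1*xi + c2*(xi^2 - 1) + c3*(xi^3 - 3*xi) in powers of xi.
parabola = [-c2, c1 - 3.0 * c3, c2, c3]
print([round(a, 4) for a in parabola])
```

To the four places quoted, the coefficients agree with the third-order parabola of the example, including the positive constant term.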



Example II. Case of moderately skew correlation. Population:
Correlation between weight of newborn boy and weight of
placenta; material supplied by the Maternity Hospital of Lund,
Sweden. Example 2 in S. D. Wicksell: "Correlation Function of
Type A, etc." / Kungl. Svenska Vetenskapsakademiens Handlingar,
Bd 58, Nr. 3 /. N = 1223.

/ x = weight of boy ; y = weight of placenta /

The correlation table and the computed moment values are
given in the said memoir of Wicksell. For $\omega(x)$ we take the
normal function.

Calculating the moments $\pi_{11}$, etc. in the same manner as used
in the first example we obtain the values

$\pi_{11} = 1.5540$, $\pi_{21} = 0.3412$, $\pi_{31} = 11.7653$,

which give the following values of the standardized moments:

$\varepsilon_{11} = 0.6420$, $\varepsilon_{21} = 0.0837$, $\varepsilon_{31} = 1.7153$.

The values are corrected for grouping.

We further get

$g_{21} = +0.0837$, $g_{31} = -0.2106$

and

$c_1 = +0.6420$, $c_2 = +0.0419$, $c_3 = -0.0351$.

The values of the adjustments of the different quantities are 
given below: 



al 

= — 0.0035 


-%o 

= -0.0656 



= + 0.0196 


~ 3‘^o 

= — 0.0173 



= - 0.2132 


- °r, 

= -0.0035 


^30 

= + 0.1320 


- *^2. 

= — 0.0328 


^40 

= — 0.2870 

^3 

" “^3 

= — 0.0031. 


The correction of ' ^ is rather great, but not greater than was 
to 'be 'expected' with consideration to the roughness of the fit of the 
' hypothetical , marginal' distribution, function. It is clear that when 
applying/ the solution . 'Of' my, previous paper in this example we 



WALTER ANDERSSON 


99 


should use a type IV curve for the marginal distribution. The 
unadjusted values of the parabola coefficients are also in this 
case easily computed, but the calculation of the adjustments by 
which the solution is turned into a least squares solution would 
be very laborious. 

In order to illustrate the suitability of the several methods I
have drawn the following curves on diagram 2: 1/ unadjusted
solution, hypothetical marginal distribution being the normal
function; 2/ adjusted solution, hypothetical marginal distribution
being the normal function; 3/ unadjusted solution, marginal dis-
tribution being Pearson's type IV function.

The equation of the second curve is

$y = -0.0419 + 0.7473\,x + 0.0419\,x^2 - 0.0351\,x^3$.
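The printed coefficients of this cubic can be checked arithmetically. The sketch below rests on an assumption drawn from a damaged scan, not on anything the text states outright: that the three values 0.6420, 0.0419 and -0.0351 given earlier in this example are the coefficients of the first three Hermite polynomials in the standardized x, which is the form a Tchebycheff expansion takes when the hypothetical marginal distribution is normal. The names `b1`, `b2`, `b3` are illustrative only.

```python
import numpy as np
from numpy.polynomial import hermite_e as He

# Assumed reading of the scan: coefficients of He_1, He_2, He_3
# (probabilists' Hermite polynomials) in the standardized x.
b1, b2, b3 = 0.6420, 0.0419, -0.0351

# Convert b1*He_1 + b2*He_2 + b3*He_3 to ordinary powers of x,
# using He_1 = x, He_2 = x^2 - 1, He_3 = x^3 - 3x.
power_coeffs = He.herme2poly([0.0, b1, b2, b3])

# Constant term, then coefficients of x, x^2, x^3.
print(np.round(power_coeffs, 4))
```

Under that assumption the conversion gives -0.0419, 0.7473, 0.0419 and -0.0351, which agrees with the four coefficients printed in the equation of the second curve.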

The third curve is undoubtedly best fitted to the data. 

Example III. Case of extremely skew correlation. The cor-
relation between the age of bachelor and the age of spinster at
marriage, Sweden 1911-1920. Example I:4 and II:7 in the cited
memoir of the author. N = 321908.

/x = age of spinster; y = age of bachelor/

27.5j,;ts- -.3/3/ % = .^ 5/5 X.' f- 

The moment values as given in the said memoir indicate that
we can use Pearson's type III function as hypothetical distribu-
tion function for the x-marginal distribution. From the moment
values as computed in the cited memoir we obtain the following
values of the constants of this function:

=1.4312 /3= 2.0483, 

It is to be remarked that for our purposes the computation
of the constant C is not necessary.

For the r-coefficients we get the following values:


e^, = — 1.3974 

€ 3 ^ = — 4.1922 

= — 8.3844 

^^^=- 1.0000 

€ 3 , = — 0.0708 

■e,i= + 11.5752 

€ 3 ^ = + 2.7948 

= + 11.3771 
e,^= - 5.7876. 





For the unadjusted values of the ^ -coefficients and the suc- 
cessive regression coefficients we further get 

5>3^ = + 0.1787 + 0.5255 5>5o = + 2.8763 

+ 0*5535 %. = + 0.05192 

0.0122669 + 0 . 003097 . 

Computing the corresponding adjusted values by use of
"Tables of the Incomplete Γ-function" we obtain

= + 0.1723 = + 0.4638 %o - + 2.0851 

^ + 0.5528 ^ + 0.05789 

^ „ 0.014647 % =: + 0.001097 

Sheppard's corrections have been applied in both cases. The 
differences between the adjusted and the unadjusted values are 


- 


= — 0.0007 


£3^= 0.0000 

f.,' 


= — 0.0034 

£.0 - 

£^o = —0.4762 

^3/ “ 

S3, 

= —0.3237 

- 

£5^ = — 2.0405 



= — 1.4548 

0^ — 

1 

•v, = — 0.0007 

%- 


= — 0.0064 

% c - 

= — 0.0617 


Sso 

= — 0.7912 

%. - 

=4-0.00597 




*S ^ 

"Yj =—0.001978 




% ' 

% = - 0.002000. 


The parabolas of the third and the fourth orders are the 
following ones: 

Unadjusted values of the coefficients: 

= -.0873 ’t ,ai2G7^ 

“'^^53 -f- . 5 / 7 /^ i- .0030^7 ^ , 

Adjusted values of the coefficients : 

^ --,0988 f- -t ^ 

The graphs are drawn on diagrams 3 and 4. 

The results indicated by the few examples treated in this
paragraph clearly point out that the Tchebycheff expansion can-
not be considered as a least squares graduation of the observed




regression means when the moment values involved in the solu- 
tion are determined in accordance with the rules laid out in § 2. 
As regards the practical applicability of such a solution, how- 
ever, this circumstance is of little importance, because the curve 
in this case is found to give as good, and sometimes a better rep-
resentation of the regression than a strict least squares graduation.
Further, as the calculation of the moments of the first few orders
is often required for other purposes than the determination of
the regression curve, the computation of the unadjusted solution
in these cases is arithmetically very simple. Not having access to
the moment values we may perhaps in some cases consider the
direct computation of the adjusted solution as performed in ex-
ample I to be the simplest method. The adjusting of correctly
determined unadjusted solutions would certainly very seldom be 
of real gain. 

Stockholm, September 1933. 


S. D. WICKSELL
Note on Dr. Andersson's Paper.

In an extensive memoir, "Researches into the Theory of Re-
gression", Dr. W. Andersson has worked out a very simple and
widely applicable numerical method of computing curved regres-
sions. The general principle on which this method was founded
Dr. Andersson has kindly attributed to me. It was laid out in
my paper in the first number of the "Annals" Journal and may
be stated as follows: After fitting a suitable univariate frequency
function with a limited number of parameters, e.g. the normal
curve or one of Pearson's types, to the marginal distribution of
the independent variate, the moments of this function, which are
all expressible in terms of the parameters, should be used in
computing the regression coefficients, instead of the ordinary





values (power means). Of course, when, in fitting the curve, the
ordinary moments of lower orders have been used in determining
the parameters, this procedure means that the moments of higher
orders are theoretically expressed in terms of the moments of
lower orders instead of being directly computed.

Applying this device to the ordinary least squares expres-
sions for the regression coefficients, it was clear that a departure
from the least square condition took place, but the chances were 
that this would not harm the result, and the computations would 
be much simplified. Dr. Andersson's investigation has shown 
that these expectations were highly justified. 

During the official ventilation of the memoir, which was pre- 
sented as Thesis for the degree of D.Ph., it was agreed that 
the method ought to be tested by a comparison with a theoretically 
very similar method in which the least squares condition was re- 
tained, although theoretical or semi-empirical weights were intro- 
duced instead of the purely empirical weights used in the method 
of Karl Pearson. 

In the present paper Dr. Andersson has taken this question 
up and he shows that whereas the original (unadjusted) method
is numerically simpler in application, it gives practically just as 
good regression curves as the new, adjusted method. In some 
cases he even considers the unadjusted solution to be the better one. 

By this the incident may seem to be closed. I should, how-
ever, like to point out, in a few words, how very straightforward
a principle it is, which lies behind this adjusted method. 

It is simply this: When a correlation table is given, the re-
gression of y on x will not be affected by multiplying the frequen-
cies within any x-array by a constant factor. Hence the follow-
ing procedure will not affect the regression of y on x; i.e., the
process of reducing or adjusting the frequencies in the several
x-arrays so that the marginal sums will be equal to the smoothed
frequencies corresponding to any mathematical curve which has




been fitted to the marginal distribution. Thus, on applying Pear-
son's ordinary least squares solution to this adjusted table a least
squares regression parabola would be obtained in which the mar-
ginal moments were those of the smoothed distribution, and also
the mixed moments were, although only in a secondary degree,
affected by the smoothing of the marginal. It is only in this last
respect, i.e. as regards the mixed moments, that this method devi-
ates from the one originally proposed.
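The principle stated here, that multiplying all frequencies within an x-array by one constant leaves the regression of y on x untouched, can be illustrated with a small numerical sketch. The correlation table, the class mid-points and the smoothed marginal frequencies below are invented purely for illustration; none of them come from the paper.

```python
import numpy as np

# Hypothetical correlation table: rows are y-classes, columns are x-arrays.
y_mid = np.array([1.0, 2.0, 3.0, 4.0])
table = np.array([
    [4.0, 2.0, 1.0],
    [6.0, 5.0, 3.0],
    [3.0, 7.0, 6.0],
    [1.0, 4.0, 8.0],
])

def array_means(freq):
    """Mean of y within each x-array (the observed regression means)."""
    return (y_mid[:, None] * freq).sum(axis=0) / freq.sum(axis=0)

# Hypothetical smoothed marginal frequencies for the three x-arrays.
smoothed_marginal = np.array([12.0, 20.0, 16.0])

# Multiply every frequency in an x-array by a single constant so that
# the column sums become equal to the smoothed marginal frequencies.
factors = smoothed_marginal / table.sum(axis=0)
adjusted = table * factors

means_before = array_means(table)
means_after = array_means(adjusted)
print(means_before, means_after)
```

The array means are identical before and after the adjustment, while the marginal sums of the adjusted table equal the smoothed frequencies. What changes is only the weight each array receives in a subsequent least squares fit.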

In my opinion many curved regressions could be very easily
and accurately enough computed by simply smoothing the mar-
ginal of the independent variate with a normal curve or, event-
ually, a Pearson Type III curve, and correspondingly adjusting
the array frequencies. This method may work well even if the
deviations of the actual distribution from the smoothed distribu-
tion are systematical.

Statistical Institute, University of Lund, November 1933. 



Diagram 1
Cubics

Unbroken curve: Adjusted solution, the hypothetical marginal dis-
tribution being the normal curve.

Dotted curve: Unadjusted solution, the hypothetical marginal dis-
tribution being the normal curve.

Dashed curve: Pearson's curve.


Diagram 2
Cubics

Unbroken curve: Adjusted solution, the hypothetical marginal dis-
tribution being the normal curve.

Dotted curve: Unadjusted solution, the hypothetical marginal distri-
bution being the normal curve.

Dashed curve: Unadjusted solution, the hypothetical marginal distri-
bution being the Pearson Type IV curve.




Diagram 3
Cubics

Diagram 4

Unbroken curves: Unadjusted solutions. Dashed curves: Adjusted
solutions, the hypothetical marginal distribution being the Pearson Type
III curve.


THE STANDARD ERROR OF ANY ANALYTIC 
FUNCTION OF A SET OF PARAMETERS EVAL- 
UATED BY THE METHOD OF LEAST SQUARES 

By 

Walter A. Hendricks, 

Bureau of Animal Industry, 

U. S. Department of Agriculture, Washington, D. C. 

After fitting a curve to a set of data by the method of least 
squares, it is occasionally necessary to use the resulting values of
some or all of the parameters of the curve in further calculations. 
Since the estimates of the values of the parameters obtained from 
a particular set of data are subject to errors of sampling, it fol- 
lows that the result of any calculation involving those values of 
the parameters will have a certain standard error. Since the 
estimated values of the parameters are not independent of each 
other, the familiar formulas based on the assumption of inde- 
pendence should not be used for the purpose of calculating this 
standard error from the standard errors of the parameters them- 
selves. The correct approach to the problem involves little more 
than an application of the methods presented by Schultz (1930) 
in his excellent paper describing the method of calculating the 
standard error of a particular function of the parameters, viz.,
the same function which was used in evaluating the parameters. 

Let $y = f(x, \lambda_1, \lambda_2, \ldots, \lambda_k)$ be an analytic function in-
volving the $k$ parameters, $\lambda_i$. This function may not be linear
with respect to the parameters, so that if the parameters are to be
evaluated by the method of least squares, we have in the general
case a function of the form:





from which the values of the parameters may be obtained by as-
suming approximate values and calculating the corrections which
must be added to obtain the most probable values.

After the values of the parameters have been obtained, let
it be required to find the standard error of a new function, $z$,
involving those values. If $z$ is an analytic
function of the parameters, we have to a close approximation:


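(2) $z = z_0 + \frac{\partial z}{\partial \lambda_1}\,\Delta\lambda_1 + \frac{\partial z}{\partial \lambda_2}\,\Delta\lambda_2 + \cdots + \frac{\partial z}{\partial \lambda_k}\,\Delta\lambda_k$,

in which $z_0$ is the value of $z$ at the approximate values of the parameters.
Any error in $z$, beyond the insignificant error introduced by the
above expansion, will then be due only to errors in $\Delta\lambda_1, \Delta\lambda_2, \ldots, \Delta\lambda_k$.
Therefore, if

(3) $Z = \frac{\partial z}{\partial \lambda_1}\,\Delta\lambda_1 + \frac{\partial z}{\partial \lambda_2}\,\Delta\lambda_2 + \cdots + \frac{\partial z}{\partial \lambda_k}\,\Delta\lambda_k$

and $s_z$ and $s_Z$ denote the standard errors of $z$ and $Z$, respec-
tively, it is at once apparent that $s_z = s_Z$.

The values of $\Delta\lambda_1, \Delta\lambda_2, \ldots, \Delta\lambda_k$ may be expressed in
terms of the data from which they were evaluated: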


( 4 ) 


aX ^ = a; M, 

P 


■ • + c 

A. 

■ > 

II 

A 

P 


• ■ + 1; 



P 



“Ma. 


variable, ^ may then be expressed in the form: 

(5) 


30 






30 


<A]. 


; From' the, well-known laws of propagation of 'error and the 
fact that 5 , it follows that 

au 


in which $s_y$ is the standard error of estimate of $y$ based on $n - k$





degrees of freedom. If the right-hand member of equation (6)
is expanded, the equation may be written in the form:


{7) 


\dX, 

-h Z 






cr^ ^ (Q -f- 


-*■ etc. 


/ a & 

in which 

The values of the sums of the squares and products multi-
plying the differential coefficients in equation (7) may be obtained
from the normal equations formed for the evaluation of
$\Delta\lambda_1, \Delta\lambda_2, \ldots, \Delta\lambda_k$. Let the normal equations be:

fa.«.3 aA, -f £a^J +-£«^JaA^= £a./7j 

-h £6- ■fr J A = £^^J 


( 8 ) 


[^a.] aA., + — +-£(?(?JaA^= [e/ij 


If these equations are solved for $\Delta\lambda_1, \Delta\lambda_2, \ldots, \Delta\lambda_k$ by the
method of undetermined multipliers, the first is multiplied by an
undetermined constant, $\alpha_1$, the second by $\alpha_2$, etc., and the resulting
products are added. The conditions for the solution for $\Delta\lambda_1$ are:
£«.a3 + fa : ^ lAtQ = / 

(9) ’ ‘T', -t Cd ' .q, -f- • ■ ' £Ai?J » O 

■ Ua] -r, f C^-frJ \ +-••' + ~ O . 

To solve for $\Delta\lambda_2$, equations (8) are multiplied by $\beta_1, \beta_2, \ldots, \beta_k$,
respectively, added, and the following conditions imposed:

£<xa.]/3, + £a^3 4, + ... ^ lafij = o- 

(10) + £^^^3 A. + ■ ' ■ + ^ 

1^0.34 f "+ •" + = A . 

To solve for $\Delta\lambda_k$, equations (8) are multiplied by
$\omega_1, \omega_2, \ldots, \omega_k$, respectively, added, and the following conditions imposed:





fetaj to, +• to^ + . . • -f 

iaPj o 

(11) 

C&a-l -t [d-d-J lOj +, . • + 

£ Ad./ K o 


£,<)«.] • -f- 

Cddj t ., . = / 

It may be proved that:



'n' - f tro j /ff = [■f/j ] 


(12) 







The method of deriving equations (12) is indicated in the well-
known text on the method of least squares by Merriman (1907)
in which a detailed proof of the fact that $[\alpha\alpha]$ is equal to $\alpha_1$ is
presented. The other relations may be derived in analogous fash-
ion. It may be observed that $[\alpha\beta] = [\beta\alpha]$, etc.

The required quantities to be substituted in equation (7) 
may, therefore, be calculated by solving the sets of simultaneous 
equations, (9), (10), and (11). 
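For readers more used to matrix notation, the result of this section can be restated as follows. This is a modern paraphrase, not the paper's own symbolism: $J$, $N$ and $g$ are names introduced here for illustration. Let $J$ be the $n \times k$ matrix of partial derivatives of the fitted function with respect to the parameters, one row per observation, $N = J^{\mathsf T} J$ the matrix of coefficients of the normal equations, and $g$ the vector of partial derivatives of $z$ with respect to the parameters.

```latex
\[
  s_z^{2} \;=\; s_y^{2}\; g^{\mathsf T} N^{-1} g
  \;=\; s_y^{2} \sum_{i=1}^{k} \sum_{j=1}^{k}
        \frac{\partial z}{\partial \lambda_i}\,
        \frac{\partial z}{\partial \lambda_j}\,
        \bigl(N^{-1}\bigr)_{ij},
  \qquad
  N \boldsymbol{\alpha} = e_1,\;\;
  N \boldsymbol{\beta}  = e_2,\;\; \ldots,\;\;
  N \boldsymbol{\omega} = e_k .
\]
```

In this reading, solving the sets of simultaneous equations (9), (10) and (11) amounts to computing the columns of $N^{-1}$ one at a time, and the symmetric quantities multiplying the differential coefficients in equation (7) are the elements of $N^{-1}$.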

This completes the solution of the general problem presented
in the first part of this paper. Some confusion may arise in re-
gard to the proper application of the methods described above if
one or both of the functions, y and z, happens to be in a linear
form with respect to the parameters. It may be shown that the
formulas given will hold in any of these special cases. Although
Taylor's theorem may be applied to such functions, such a treat-
ment is superfluous. If either or both of the functions, y and z,
is linear with respect to the parameters, the expression for $s_z^2$
is identical with equation (7) even though the linear function, or
functions, was not first expanded by Taylor's theorem. Further-
more, if y is linear with respect to the parameters, the values of
the coefficients, $[\alpha\alpha]$, etc., in equation (7) will be the same re-
gardless of whether the parameters were evaluated directly or
whether y was first expanded. The latter statement is evident from



an inspection of equations (9), (10), and (11) and a considera-
tion of the law of formation of normal equations.

As an example of the application of the methods presented 
in this paper to a specific problem, consider a set of data given 
by Spillman (1933) relating to the yields of potatoes obtained 
from four plots of ground which had been treated with different 
amounts of potash. 

Yields of Potatoes From Four Plots of Ground Receiving
Different Amounts of Potash

x (Units of K2O)     y (Bushels of potatoes)
       0                      91
       1                     251
       2                     331
       3                     381


When a simple exponential equation of the form,

(13) $y = A - B e^{-kx}$

was fitted to this set of data, the most probable values of the
parameters, A, B, and k, were found to be as follows:

A = 432.801 ± 11.637
B = 341.393 ± 11.406
k = 0.6195918 ± 0.0462871.

The value of the product of the parameters, A and k, hap-
pens to be of some interest, at least to the author of this paper,
since it gives the value of the first derivative of y with respect
to x at the point where the curve crosses the x-axis. In the pres-
ent example it represents the increase in yield, per unit increase
in amount of potash applied, which would be possible if certain
inhibiting influences, which seem to be proportional to the yield,
had no effect. For the particular data under consideration, the
value of this product is 268.160.





In order to calculate the standard error of this value, equa-
tion (7) was applied as follows:

(14) $s_{Ak}^2 = \left(k^2[\alpha\alpha] + 2Ak[\alpha\gamma] + A^2[\gamma\gamma]\right) s_y^2$,

from which the standard error of Ak was found to be equal to
±13.331.

The familiar formula for calculating the standard error of
the product of two independent quantities, when employed for
the purpose of calculating the standard error of Ak, may be writ-
ten in the form:

(15) $s_{Ak}^2 = A^2 s_k^2 + k^2 s_A^2$.

Equation (15) gives a value of ±21.291 for the standard error
of Ak, which deviates considerably from the correct value given
by equation (14). The discrepancy is due entirely to the fact that
the estimated values of A and k are not independent.
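The comparison can be retraced numerically. The sketch below is a modern restatement under stated assumptions rather than the author's own computing scheme: the fitted curve is taken to be y = A - B exp(-kx), the four yields are taken as 91, 251, 331 and 381 (both read from a damaged scan), the fit is done by Gauss-Newton iteration, and the inverse of the normal-equation matrix stands in for the bracketed sums of equation (7). All function and variable names are illustrative.

```python
import numpy as np

# Data as read from the scan: x = units of K2O, y = bushels of potatoes.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([91.0, 251.0, 331.0, 381.0])

def model(p):
    A, B, k = p
    return A - B * np.exp(-k * x)

def jacobian(p):
    A, B, k = p
    e = np.exp(-k * x)
    # Columns: partial derivatives with respect to A, B and k.
    return np.column_stack([np.ones_like(x), -e, B * x * e])

# Gauss-Newton iterations from a rough starting guess.
p = np.array([430.0, 340.0, 0.6])
for _ in range(50):
    step, *_ = np.linalg.lstsq(jacobian(p), y - model(p), rcond=None)
    p = p + step

A, B, k = p
J = jacobian(p)
resid = y - model(p)
s_y2 = resid @ resid / (len(x) - len(p))   # four observations, three parameters
cov = s_y2 * np.linalg.inv(J.T @ J)        # covariance matrix of (A, B, k)

grad = np.array([k, 0.0, A])               # partial derivatives of z = A*k
se_with_cov = np.sqrt(grad @ cov @ grad)   # allows for correlation of A and k
se_independent = np.sqrt(A**2 * cov[2, 2] + k**2 * cov[0, 0])

print(A * k, se_with_cov, se_independent)
```

Under these assumptions the product comes out close to 268.16, and the two standard errors come out close to the 13.331 and 21.291 quoted in the text. The whole difference between them is the cross term in A and k, which is negative here and is dropped by the formula for independent quantities.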


REFERENCES

Merriman, Mansfield. 1907. The Method of Least Squares. John
Wiley & Sons, New York.

Schultz, Henry. 1930. The standard error of a forecast from a curve.
Jour. Amer. Stat. Assoc., 25 (N. S. 170): 139-185.

Spillman, W. J. 1933. Use of the exponential yield curve in fertilizer
experiments. U. S. D. A. Tech. Bul. 348.



TRANSFORMATION OF NON-NORMAL FRE- 
QUENCY DISTRIBUTIONS INTO NORMAL 
DISTRIBUTIONS* 


By 

G. A. Baker

This investigation is undertaken for two reasons: (1) there 
has been a demand on the part of some statisticians for an analytic 
method of transforming non-normal distributions into normal
distributions; and, (2) a non-normal distribution and the trans- 
formation necessary to transform it into a normal distribution 
serve to specify the distributions in random samples of estimates 
of the parameters of the original distribution in terms of the 
distributions of estimates of the parameters of a normal distribu- 
tion in random samples. In this way valuable approximations to 
the distributions of the parameters of the original non-normal
population may be secured.


PART I

TRANSFORMATIONS OF FREQUENCY
DISTRIBUTIONS

Consider a non-normal frequency distribution represented by
$f(x)\,dx$ where the origin is taken at some central point, say the
mode, mean, or median, or near one of these points, and the scale
is, or approximately is, the standard deviation of the distribution.
We seek a function, $\varphi$, such that $f(x)\,dx$ transformed by the
transformation $x = \varphi(u)$
becomes a normal distribution of total area $\sqrt{2\pi}$, mean zero and
standard deviation unity, i.e.

* Presented at the May, 1932 meeting of the Illinois section of the
American Mathematical Association.



(1) $f(x)\,dx = e^{-u^2/2}\,du$.

In a previous paper¹ expressions similar to (1) were regarded as
differential equations which can be solved exactly in certain special
cases. In the case of (1) it seems preferable to regard

(2) $f[\varphi(u)]\,\varphi'(u) = e^{-u^2/2}$

as an identity in $u$. If it is assumed that $f$ and $\varphi$ are functions
that can be represented by Maclaurin's expansions, which is a
reasonable assumption regarding $f$ and $\varphi$ if $f$ is near normal,
the two members of (2) can be expanded and the coefficients of
corresponding powers of $u$ can be equated, thus determining $\varphi$.
Suppose that


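(3) $f(x) = \sum_{i=0}^{\infty} A_i x^i$,

(4) $\varphi(u) = \sum_{i=1}^{\infty} B_i u^i$.

Then (2) becomes

$\left[\sum_{i=0}^{\infty} A_i \left(\sum_{j=1}^{\infty} B_j u^j\right)^{i}\right] \sum_{j=1}^{\infty} j B_j u^{j-1} = \sum_{n=0}^{\infty} \frac{(-1)^n u^{2n}}{2^n\, n!}$.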

Hence 

A" T , A = 

' % y z ^ 6 ' ' ■* > 


¹ Transformations of Bimodal Distributions, Annals of Mathematical
Statistics, Vol. I, No. 4, Nov. 1930.





5, - -A, ^ 

4 = %-A.(3X,*-R.BfiXApX^BBXAxX- 
4- 

-aXb’b^* + 2 
- A , xx,^bx:)-aXb^- ^ 


5 


A = - il- - A ( 4 4 ^ 4 4 4 " 4 4 4 ) 

- AjB,% i- B, ■<■ b;bX ^ bXx) 

- A, xX -i 4 Y)-^5 A4 - 

- -AXbX,^-^4 B, B,^) 

-aXbX^i-b, b^b^* b, 4 B^^-i 

- A/^ "4 . 3 4 ^ I « f eSX) 

- '>■, ( 4 Ai ^ A "4 4 ^ i 4’4’) 





The corresponding formulas for determining a function to
transform a normal distribution of total area $\sqrt{2\pi}$, mean at zero
and standard deviation of unity, into a given non-normal distri-
bution are as follows. (The A's are the coefficients in the expan-
sion of the given non-normal distribution and the B's are the
coefficients of the transforming function.)

A, ^ = A , J5 = A ^ 4! , 


■®5 -■ i ^ t * i ^ 4 "- r. B/ 


= i A, i- 1 f eX 
= i 4 i ^ q *■ i e;-^ ^ f i? ^‘.^44 

4 = i 4 4 ^ i 4 - 4 . - i 4 4" 

+ 444 ^■ 444 -i- 4’4 


These formulas give very simple results for the expression
of the first few terms of the transforming function, $\varphi$, in terms
of the coefficients of the given function. If the coefficients in the
expansion of $\varphi$ rapidly approach zero so that only a few terms
are needed for a good approximation the method outlined should




be effective. Edgeworth² has discussed at some length the trans-
formation or "translation" of normal distributions into non-nor-
mal distributions and has given several methods of determining
the coefficients of the transforming function. The formulas pre-
sented here are more simple but their practicability can be demon-
strated only by numerical results in special cases. For practical
purposes the left-hand member of (6) need represent the right-
hand member accurately only in the interval, say $-2 \le u \le 2$.


Illustration 
For example, consider 

^iCX ■ 

/cx) = . (i-h j^) e ^ 

which is skewed noticeably in the positive direction but which is
of a type that approaches a normal distribution as the skewness
approaches zero. Then

A_0 =  .9929      B_1 = 1.0072
A_1 = -.1000      B_2 =  .0511
A_2 = -.4887      B_3 =  .0050
A_3 =  .0823      B_4 = -.0080
A_4 =  .1142      B_5 =  .0004
A_5 = -.0279
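The equating of coefficients described in Part I can be sketched as a short routine. It is an illustration under assumptions, not the paper's printed formulas: the identity is taken to be f(φ(u)) φ'(u) = exp(-u²/2) with φ having no constant term, and the B's are found one at a time, since the coefficient of u^(n-1) involves B_n only through the term n·A_0·B_n. The function names are hypothetical, and the A's are the first four values of the illustration as read from the scan.

```python
import math
import numpy as np

def series_mul(p, q, n):
    """Product of two power series, truncated to the first n coefficients."""
    return np.convolve(p, q)[:n]

def compose(A, phi, n):
    """Coefficients of f(phi(u)) up to u^(n-1), by Horner's rule."""
    out = np.zeros(n)
    for a in reversed(A):
        out = series_mul(out, phi, n)
        out[0] += a
    return out

def normal_series(n):
    """Coefficients of exp(-u^2/2) up to u^(n-1)."""
    t = np.zeros(n)
    for j in range(0, n, 2):
        t[j] = (-0.5) ** (j // 2) / math.factorial(j // 2)
    return t

def transforming_coefficients(A, m):
    """B_1..B_m so that f(phi(u)) * phi'(u) matches exp(-u^2/2) through u^(m-1)."""
    A = np.asarray(A, dtype=float)
    target = normal_series(m)
    phi = np.zeros(m + 1)                      # phi[i] holds B_i, with B_0 = 0
    for n in range(1, m + 1):
        dphi = np.arange(1, m + 1) * phi[1:]   # coefficients of phi'(u)
        lhs = series_mul(compose(A, phi, m), dphi, m)
        # With B_n still zero, lhs[n-1] holds every term except n*A_0*B_n.
        phi[n] = (target[n - 1] - lhs[n - 1]) / (n * A[0])
    return phi[1:]

A = [0.9929, -0.1000, -0.4887, 0.0823]         # A_0..A_3 as read from the scan
B = transforming_coefficients(A, 3)
print(np.round(B, 4))
```

With these four A's the first three B's agree with the values 1.0072, .0511 and .0050 listed above to within the rounding of the printed A's. The sketch is deliberately limited to the first three coefficients.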




² Bowley, A. L., F. Y. Edgeworth's Contributions to Mathematical
Statistics, pp. 65-75.



Table I

Comparison of the ordinates of the normal function, function
with skewness of .2, and the skewed function transformed by the
transformation $x = 1.0072u + .0511u^2 + .0050u^3 - .0080u^4$

   u     Normal     Function       Transformed    Normal       Normal minus
         curve*     with skew .2*  skew curve†    minus        transformed
                                                  skew curve   skew curve
 -2.0    .053991    .049243        .0576                       .0036
 -1.8    .078950    .076810        .0910                       .0120
 -1.6    .110921    .112956        .1327                       .0118
 -1.4    .149727    .157043        .1715          .0073        .0218
 -1.2    .194186    .206951        .2099          .0128        .0157
 -1.0    .241971    .259120        .2505          .0171
 -0.8    .289692    .308958        .2897          .0193        .0058
 -0.6    .333225    .351538        .3366          .0183        .0033
 -0.4    .368270    .382453        .3776          .0142        .0093
 -0.2    .391043    .398583        .3907          .0075        .0004
  0.0    .398942    .398859        .3989          .0001        .0000
  0.2    .391043    .383157        .3906          .0079        .0005
  0.4    .368270    .354545        .3688          .0137
  0.6    .333225    .316273        .3299          .0170
  0.8    .289692    .272360        .2842          .0173        .0055
  1.0    .241971    .226714        .2323          .0153        .0097
  1.2    .194186    .182641        .1803          .0115        .0139
  1.4    .149727    .142563        .1319          .0072        .0178
  1.6    .110921    .107939        .0908          .0030        .0202
  1.8    .078950    .079354        .0717          .0004        .0073
  2.0    .053991    .056702        .0452          .0027

* These columns were taken from Luis R. Salvosa's tables, Annals of
Mathematical Statistics, Vol. I, No. 2, p. 64 et seq.

† These values were calculated by interpolating in the above mentioned
tables.
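A few entries of Table I can be cross-checked by direct computation. The sketch below assumes, as a reading of the scan, that the "Normal curve" column holds ordinates of the standard normal density and that the difference columns are printed without sign. The three rows used are copied from the table; the sign of u = -1.4 is inferred from the direction of the skewness.

```python
import math

def normal_ordinate(u):
    """Ordinate of the standard normal density at u."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

# (u, normal ordinate, skew-curve ordinate) as read from Table I.
rows = [
    (-1.4, 0.149727, 0.157043),
    (0.0, 0.398942, 0.398859),
    (2.0, 0.053991, 0.056702),
]

# Largest disagreement between the tabled and recomputed normal ordinates.
max_normal_error = max(abs(normal_ordinate(u) - n) for u, n, _ in rows)

# Unsigned "normal minus skew curve" differences, to four decimals.
diffs = [round(abs(n - s), 4) for _, n, s in rows]
print(max_normal_error, diffs)
```

The recomputed normal ordinates agree with the tabled ones to six decimals, and the three differences reproduce the tabled .0073, .0001 and .0027.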


The ordinates of the normal curve, $f(x)$, and $f(x)$ trans-
formed by the transformation determined by the first four B's
are compared in Table I.

The ordinates of the transformed distribution are much
nearer those of the normal curve over an interval that includes
seventy-five per cent of the frequency but for the rest of the
range considered the agreement is not so good. These facts indi-
cate that more terms of the transforming function must be taken
in order to secure close results for large values of $|u|$.

It is difficult to set up a rigorous criterion as to the number
of B's necessary to define adequately the transforming function,
but the following considerations are of value in this connection.
Suppose that J^(x} may be adequately represented in the 
interval €i^x{ {r by r?7y terms, Le. 

+ A, X ^ X%. 

and that $m$ is large enough so that the first $m$ terms of the ex-
pansion of the normal function give an adequate representation
of it. This is clearly possible since the expansions with which
we are dealing converge uniformly in the open interval. Then the
first $m$ B's may be determined so that the first $m$ terms of
$f(x)\,dx$ transformed by the transforming function determined
by the $m$ B's are identical with the first $m$ terms in the
expansion of the normal function. In addition there will remain
certain terms which may cause a serious discrepancy. For $f(x)\,dx$
becomes

yn, 

> 

Let us assume that all ^ o ^ iyrri^ and investigate the 
terms in to of degree higher than rw . 

Since the first terms of $f(x)\,dx$ transformed contribute few
terms involving $u^n$, $n > m$, and the higher order terms have
small coefficients, it is to be expected that if $m$ B's are used a
good result will be obtained, at least for moderate values of $|u|$.
Some skewed distributions that differ considerably from
normal may yield a rapidly converging sequence of B's; that is, in
case there is a natural relation of this kind existing between the non-
normal and normal distributions.

The main reason for investigating the possibility of an easily
determined transforming function that will transform a non-
normal distribution into a normal distribution is the fact that the
distributions in random samples of estimates of the parameters
of the non-normal distribution can be expressed in terms of the
transformation and the sampling distributions of the parameters
of the normal distribution into which the non-normal distribution
is transformed. This proposition is developed in Part II.

PART II

DISTRIBUTION OF THE ESTIMATES OF THE 
PARAMETERS OF NON-NORMAL 
DISTRIBUTIONS 

Suppose that a variable $x$ is distributed as $f(x)\,dx$ where
$f(x)$ is such that it can be transformed into a normal distribution
by means of a quadratic transformation,

( 1 ) X = CL^ -h 
Then ^(x.)dtK becomes 

( 2 ) J , 

where ^ is normally distributed. 

The total of a sample of $n$ $x$'s drawn at random from $f(x)$ is

(3) 

which by virtue of (1) becomes 

( 4 ) - + + 

The coefficient of $a$ in (4) is an estimate of the total of a sample
of $n$ of a normally distributed variable and the coefficient of $b$ is
an estimate of the second moment about a fixed point of a nor-
mally distributed variable which can be written as an estimate
of the standard deviation squared plus the estimate of the mean
squared. Thus (4) can be written as
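The rewriting of the sample total in terms of the estimated mean and standard deviation rests on a simple identity, which can be checked numerically. The sketch assumes the quadratic transformation has the form x = a·v + b·v², with v the normally distributed variable. The symbols `a`, `b`, `v` and the numerical values are illustrative, since the printed symbols are not legible in the scan, and the standard deviation estimate is taken with divisor n.

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = 1.0, 0.05            # hypothetical coefficients of the transformation
n = 25
v = rng.normal(size=n)      # sample of the normally distributed variable
x = a * v + b * v**2        # the corresponding sample of x

m_bar = v.mean()                              # estimate of the mean
sigma_bar_sq = ((v - m_bar) ** 2).mean()      # estimate of the variance (divisor n)

total_direct = x.sum()
total_from_estimates = n * (a * m_bar + b * (sigma_bar_sq + m_bar**2))
print(total_direct, total_from_estimates)
```

The two totals coincide, because the sum of the squares of the v's equals n times the sum of the estimated variance and the squared estimated mean. This is what allows the distribution of the total to be expressed through the known, independent distributions of those two estimates.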





where the bar over $m$ and $\sigma$ denotes estimates of these para-
meters by means of samples. The distributions of $\bar{m}$ and $\bar{\sigma}$ are
known and are independent. If the mean of distribution (2) is
taken to be zero, then m is distributed as proportional to e 2 ^ 
and is distributed as proportional to 


(5) e 


71 (CL+Z i 


f-ho. ^et+ 






\/ a V V <ry 


-_Si. < ot?. 

z& ^ 


The distribution of is. except for a constant factor, 


( 6 ) 


77 3 - 

— 2 "' Z?!- ^ , 

£ e ^ ^ ^ z- 4 


if. 77 ^2. 

If two variables, x and ^ , are distributed as dx- cl^ ^ 

then the proliability that a value of f (^j^) - ^ 
is indi/is given as the surface area of the cylinder be- 
tween and times dif 

In this case the probability function of ^ is pro- 

portional to 


if 


(7) 


r — iip: 






'■h 


\/aF+^iT^ 




Put and (7) becomes 







³ Baker, G. A., Random Sampling from Non-Homogeneous Popula-
tions, Metron, Vol. VIII, No. 3, Feb. 1930.





If $f(x)$ can be transformed into a normal distribution by
means of a cubic transformation,

(9) X= ^ 

then (3) becomes 

(10) a ^ 

which can be written as 'tl (a.?rZ irm ^cm t3C & € j ^ 


Hence the means of samples of $n$ are distributed as proportional
to



( 11 ) 

(3CX^^)^ 

where X/B represents the interval or intervals' for which 
c xF and ^cXf* ^ have the same signs and 
iJ varies from to cxo , 

Suppose that the given frequency distribution can be trans- 
formed into a normal distribution by the transformation 

and consider the expression for the estimation of the standard
deviation squared of the $x$-distribution from a sample of $n$:


( 12 ) 


y » 2 

(X. 


■■ t X, 


n- 








7 % 





which becomes


2 

(13) [ (gj^ ^-hCa^^i- %;j] r 

TV ' IZ 72, -J 

where ^ is normally distributed. In terms of the estimates of the 
mean and standard deviation of the V5 (13) can be written as 

^ I Ttm ^ 

zS- ^ -t CL^ air r ^ r M , 

Hence the estimates of the standard deviations of the original 

population will be distributed as proportional to 



^ ^ ^ <0,%. ya Ixx f V ^ x*") 6 •6- V J 

where i/ varies from 0 to f*oo . 

The distributions of the estimates of other parameters and 
the distributions of the estimates of the mean and standard devi- 
ation for different transformations can be expressed in terms of 
the distributions of the mean and standard deviation of the re- 
sulting transformed normal distribution but it is obvious that the 
process becomes complicated. 




INVARIANTS AND COVARIANTS OF CERTAIN 
FREQUENCY CURVES 


Richmond T. Zoch


Introduction. After the most convenient type of equation
$y = f(x, a, b, c, \ldots)$ has been selected and the parameters $a, b,
c, \ldots$ in the selected equation have been determined so that for
a given set of values $x_i$ $(i = 1, 2, \ldots, n)$, the computed values
$y_i$ $(i = 1, 2, \ldots, n)$ agree as closely as possible or as closely as
is consistent with the observed values $Y_i$ $(i = 1, 2, \ldots, n)$, it may
be desirable to make one or more of the transformations: (1)
move the origin, (2) use a different scale (unit of measure),
(3) change the total frequency.

This paper discusses certain invariants and covariants of the 
above transformations which were noted in developing the general 
theory for the Pearson Curves of frequency. 

1. Change of Origin, Instead of considering the diff. eq., 

(o ^ = ^(x-r) 

^ X.^ +■ ^ X ^ 

which is the diff. eq, from which the Pearson curves are derived, 
we take the more general diff. eq., 


( 2 ) - 1 / 

Equation (1) is a special case of equation (2). 

Make the following substitutions: 


x-X + F, 




( 3 ) ; 


F f . . , . , -t 4- - 3 





and on simplifying we obtain, 

( 4 ) - iJi - 

If we now write 

( 5 ) X - X- F 

we have : 

( 6 ) = — . 

The solutions of equations (4) and (6) can be written in 
the form 

( 7 ) ^ ^ GC%) . GU-r), 

where P is the mode as will be observed from the diff. eq. In
other words the frequency function is a function of $(x - P)$
when it is written in the form of eq. (7). Therefore if we change
the origin of x by writing $x' = x - h$ all of the constants of the
frequency curve will remain unchanged if at the same time P be
subjected to the transformation $P' = P - h$.

2. Change of Total Frequency. Let be the constant of 
integration when the area under the curve is unity and when the 
argument is X = /L the constant of integration when the 

argument is X - for an arbitrary area under the curve; and 
/I/ the total frequency. Now whei^ the total frequency is changed 
the area under the curve is changed, hence from the above defini- 
tions 

( 8 ) 

Therefore if the total frequency be /!/ and it is desired to write 

the equation of the frequency function for a total frequency |)f 

Af then write /C for where 

(9) /c - CK./V)i-/V 

and leave all of the remaining constants unchanged. 

It should be emphasized that in leaving the remaining con-
stants unchanged we assume that the distribution of the new
sample or the universe obeys the same law as the old sample.
Occasionally one sees the statement in works on probability and
statistics in connection with the Theory of Errors that as the
number of observations is increased indefinitely, the arithmetic
mean tends to the true value of a distribution. This statement is
based upon the tacit assumption that an observation less than the
true value (most probable) is as likely to occur as an observation
greater than the true value. If we make this assumption we will
always (if the number of observations be sufficiently large) ulti-
mately obtain a symmetrical frequency curve (the A. M. coincides
with the axis of symmetry) and this assumption contradicts the
assumption that the distribution of the new sample obeys the same
law as the old sample (except the old sample itself be symmetric-
ally distributed).

3. Change of Scale. We are now ready to consider the
behavior of the constants when the unit of measure is changed. 
Perhaps it is well to point out here that quite often it is desirable 
to change the unit from months to years, from feet to yards, from 
pounds to grams, etc. The behavior of the constants under a 
change of scale is not as easily arrived at as for the changes of 
the origin and total frequency. 

The behavior of where is the coefficient of the high- 
est power of X in F(X) of the differential equation,..:^*. , 
will first be obtained. 

Elderton¹ uses moments to determine the constants of a frequency curve; Thorkelsson² and Fisher³ have used Thiele's semi-invariants for this purpose.

¹ W. Palin Elderton, "Frequency Curves and Correlation", Second Edition, 1927, London.

² Thorkell Thorkelsson, "Frequency Curves Determined by Semi-Invariants" (Visindafelag Islendinga IX), Reykjavik, Rikisprentsmidjan Gutenberg, MCMXXXI.

³ Arne Fisher, "Frequency Curves", translated by E. A. Vigfusson, American Edition, 1922, The Macmillan Co.

Semi-invariants have an advantage over moments in that the values of the higher semi-invariants do not change when the origin is changed. Moreover Fisher (pp. 12-16, loc. cit.) has pointed out how the semi-invariants behave when the unit is changed, viz:


$\lambda_1(ax + c) = a\,\lambda_1(x) + c,$

$\lambda_r(ax + c) = a^r\,\lambda_r(x), \quad r > 1.$
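The scaling rule for semi-invariants quoted above can be checked numerically. The following is only a sketch: the sample values, the helper names and the restriction to the first four semi-invariants (expressed through central moments) are assumptions made for illustration and do not come from Fisher or Thorkelsson.

```python
from fractions import Fraction


def central_moment(data, k):
    """k-th moment of the data about its own mean (divisor n)."""
    n = len(data)
    mean = sum(data) / n
    return sum((v - mean) ** k for v in data) / n


def semi_invariants(data):
    """First four semi-invariants (cumulants) of a finite set of values."""
    data = [Fraction(v) for v in data]
    mean = sum(data) / len(data)
    m2 = central_moment(data, 2)
    m3 = central_moment(data, 3)
    m4 = central_moment(data, 4)
    # lambda_1 = mean, lambda_2 = m2, lambda_3 = m3, lambda_4 = m4 - 3*m2^2
    return [mean, m2, m3, m4 - 3 * m2 ** 2]


def transform(data, a, c):
    """Apply x' = a*x + c to every value."""
    return [a * Fraction(v) + c for v in data]


sample = [1, 2, 2, 3, 7, 11]          # hypothetical observations
a, c = Fraction(5), Fraction(-4)      # hypothetical change of unit and origin
before = semi_invariants(sample)
after = semi_invariants(transform(sample, a, c))

for r in range(1, 5):
    print(r, before[r - 1], after[r - 1])
```

Exact fractions are used so that the comparison between the two sets of semi-invariants is free of rounding error.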


Referring to equation (2) let 7 ^ be the value of F when 
the origin is at the arithmetic mean, and let , 2^., , . . . 4^, 

and S; be the values of 4-^ , , . . . , and ^ when the 

origin is at the arithmetic mean. Now Thorkelsson (loc. cit.) has 
pointed out that when his method is used for computing the con- 
stants of the curve there will be only one equation involving 
and only one equation involving 4'- Moreover the coefficients 
of the and the constant terms of the remaining equations 

will be of constant weight.

Below is an example of the equations obtained when Thor- 
kelsson's method is used to compute the constants : 


f'P + C 

K+ 

A3 ^ 1 ^ ^ A3 ^ ^ 5 ) C = o 

A^ 4 - 5 X, ^ A.z-z/sX ^ - o 

f "*■ ^ jt. j? 5 





Note that only the first of the above equations involves ^ and 

s f 

only the second involves % . 

Since the coefficients are of constant weight they are invariants⁴ of index w, where w is the weight of the coefficient, when x is subjected to the transformation x' = ax + c.

Suppose that we now consider the general case where F(x) is of degree n. Hence, in general, equations (10) will consist of n + 2 equations in n + 2 unknowns; the unknowns being the constants of equation (2). Disregard the two equations noted above, each of which involves a constant that appears in no other equation, and then there remain n equations in n unknowns. Observe that the weights of the coefficients of the unknowns form an A.P. whether taken by rows or by columns. Also the weights of the constant terms form the same A.P. as the columns.

We now state the 

Lemma : If all of the elements of a determinant are covari- 
ants and the weights (indices) of the elements of every row form 
an A.P. and of every column form an A.P. then when the deter- 
minant is expanded every term is of constant weight (index). 

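Proof: Let the A.P. formed by the weights of the elements of the i-th row be $w_{ij} = a_i + (j-1)\delta$. Then the weights of the elements can be displayed as follows:

$\begin{matrix} a_1 & a_1+\delta & a_1+2\delta & a_1+3\delta & \cdots & a_1+(n-1)\delta \\ a_2 & a_2+\delta & a_2+2\delta & a_2+3\delta & \cdots & a_2+(n-1)\delta \\ \vdots & & & & & \vdots \\ a_n & a_n+\delta & a_n+2\delta & a_n+3\delta & \cdots & a_n+(n-1)\delta \end{matrix}$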

(It should be emphasized that the above is not the determinant mentioned in the statement of the Lemma but the elements of the above array represent the weights of the elements of the determinant.) Now since by hypothesis the elements of every column have such weights that the weights form an A.P., then $a_1, a_2, a_3, \ldots, a_n$ must form an A.P. Let this A.P. be $a_i = a_1 + (i-1)\delta'$. Making use of this notation the weights of the elements of the determinant can be displayed as follows:

$\begin{matrix} a_1 & a_1+\delta & \cdots & a_1+(n-1)\delta \\ a_1+\delta' & a_1+\delta+\delta' & \cdots & a_1+(n-1)\delta+\delta' \\ \vdots & & & \vdots \\ a_1+(n-1)\delta' & a_1+\delta+(n-1)\delta' & \cdots & a_1+(n-1)\delta+(n-1)\delta' \end{matrix}$

⁴ "… Theories", Benj. H. Sanborn & Co., 1920, Chicago, Chapter …

Hence the weight of the element in the i-th row and the j-th column is $a_1 + (i-1)\delta' + (j-1)\delta$. Along the principal diagonal of the determinant i = j. Therefore when the determinant is expanded the weight of the term consisting of the elements of the principal diagonal is the sum of the A.P. $w_i = a_1 + (i-1)(\delta + \delta')$, or

$\frac{n}{2}\left[\,2a_1 + (n-1)(\delta + \delta')\,\right] = W.$

Every term in the expansion is of weight W because each term consists of one element from each row and one element from each column and hence the weight is equal to the sum of two series, each being an A.P., plus the weight of the term in the upper left corner.
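The Lemma can be illustrated by brute force for a small determinant. This is a sketch under assumptions: the function names and the particular values of n, a1 and the two common differences are hypothetical, and the check simply enumerates every term of the expansion.

```python
from itertools import permutations


def weight_table(n, a1, d_row, d_col):
    """Weights a1 + i*d_col + j*d_row (0-based i, j): an A.P. along every row and column."""
    return [[a1 + i * d_col + j * d_row for j in range(n)] for i in range(n)]


def term_weights(table):
    """Set of total weights over all terms of the determinant expansion.

    Each term takes one element from every row and every column,
    i.e. it corresponds to one permutation of the column indices.
    """
    n = len(table)
    return {sum(table[i][p[i]] for i in range(n)) for p in permutations(range(n))}


def diagonal_weight(n, a1, d_row, d_col):
    """Closed form (n/2) * [2*a1 + (n-1)*(d_row + d_col)] for the principal diagonal."""
    return n * (2 * a1 + (n - 1) * (d_row + d_col)) // 2


table = weight_table(4, 3, 1, 2)       # hypothetical weights
print(term_weights(table))             # a single value if the Lemma holds
print(diagonal_weight(4, 3, 1, 2))
```

Because the weight of a term depends only on which rows and which columns are used, and every term uses all of them, the set printed first contains exactly one number.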

Theorem: If all of the coefficients and the "constant" terms of a system of n linear equations in n unknowns are covariants of such respective weights (indices) that the weights (indices) of the elements of every row of the matrix of the system of equations form an A.P. and of the elements of every column of the augmented matrix form an A.P., then the solutions are covariants whose weights (indices) form an A.P. whose common difference is of the same magnitude but of opposite sign to the common difference in the A.P. of the weights (indices) of the elements of the rows.





Proof: By Cramer's rule the solutions are

$z_i = \dfrac{\Delta_i}{\Delta}$, where $\Delta = \begin{vmatrix} K_{11} & \cdots & K_{1n} \\ \vdots & & \vdots \\ K_{n1} & \cdots & K_{nn} \end{vmatrix}$

and where $\Delta_i$ is the n-rowed determinant obtained from $\Delta$ by replacing the elements of the i-th column by the "constant" terms of the system. Let the weight (index) of the element in the i-th row and j-th column of $\Delta$ be $a_1 + (i-1)\delta' + (j-1)\delta$. Also let the A.P. formed by the weights (indices) of the elements of the i-th row of $\Delta$ be $a_i + (j-1)\delta$, hence in particular the A.P. of the weights (indices) of the elements of the first row is $a_1 + (j-1)\delta$. Further let the A.P. formed by the weights of the elements of the column of constant terms of the augmented matrix be $c_1 + (i-1)\delta'$. By the lemma just established we see that when $\Delta$ is expanded all of the terms of the expansion will be of the same weight (index) W. Hence $\Delta$ is of weight (index) W. Also since the A.P. of the weights (indices) of the column of constant terms is $c_1 + (i-1)\delta'$, then the weight (index) of each term in the expansion of $\Delta_i$ will be $c_1 - [a_1 + (i-1)\delta]$ different from the weight (index) of each term in the expansion of $\Delta$. Hence the weight (index) of $\Delta_i$ is $W + c_1 - [a_1 + (i-1)\delta]$. Therefore the weight (index) of $z_i$ is $c_1 - a_1 - (i-1)\delta$ and the theorem is established.
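The Theorem can be tried on a small numerical system. In the sketch below every name and number is an assumption chosen for illustration: coefficient (i, j) is given weight a1 + i*d_col + j*d_row, the constant term of row i is given weight c1 + i*d_col (0-based indices), each quantity is multiplied by the scale factor raised to its weight, and the solutions of the scaled system are compared with those of the original one.

```python
from fractions import Fraction


def solve(matrix, rhs):
    """Solve a square linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [v / lead for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def scaled_system(coeffs, consts, a, a1, c1, d_row, d_col):
    """Multiply every coefficient and constant by a raised to its assumed weight."""
    n = len(consts)
    a = Fraction(a)
    matrix = [[coeffs[i][j] * a ** (a1 + i * d_col + j * d_row) for j in range(n)]
              for i in range(n)]
    rhs = [consts[i] * a ** (c1 + i * d_col) for i in range(n)]
    return matrix, rhs


coeffs = [[2, 1, 3], [1, 4, 1], [3, 2, 5]]   # hypothetical coefficients
consts = [1, 2, 3]                           # hypothetical constant terms
a, a1, c1, d_row, d_col = 3, 2, 5, 1, -1     # hypothetical scale and weights

base = solve(coeffs, consts)
scaled = solve(*scaled_system(coeffs, consts, a, a1, c1, d_row, d_col))
for j, (z0, z1) in enumerate(zip(base, scaled)):
    print(j + 1, z1 / z0)   # power of a picked up by the j-th solution
```

The printed ratios are successive powers of the scale factor whose exponents fall by d_row from one unknown to the next, which is the arithmetic progression of opposite sign described in the Theorem.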

Applying the above Theorem and observing that / in 
equations (10) we obtain the result: 

weight of - 3-Z i- * 

Since … we have the result that when x is subjected to the transformation x' = ax + c, then … . Or in other words … is an invariant of index 2 − n under the transformation.

Next comes the consideration of … . Here we have



n + 1 equations in n + 1 unknowns and the augmented matrix has elements of the following weights (- means that an element is lacking):



0 



3 - • • • 

n-i 



'X 

3 


5 .... 


3 


3 


5 

C, .... 

VfX 

4 


if 

V 

5 

G 

7 .... 

77f3 

S 


(Tti-l} 

f 

(?ltz) 

(7lf3) 

('rjf'f) . • . • 

21? 

(ni-x) 


Now C 

is the quotient of two determinants 

formed 

from 


the above matrix and if these two determinants be expanded in 
terms of the minors of the first column we see that the weight 
of ^ L That is is of weight 1 regardless 

of the degree of F(X)* Therefore is an invariant of index 1. 

Next considering ixj the augmented matrix has the same 
elements as for except that the first row is now : 

•4 Z 3 H -n-i Z 

Following the same procedure we see that the weight of

CWfSj-W = 2- Therefore C is an invariant of index 2. 

We can look upon equations (3) as a transformation. We can reverse this transformation by solving for the … in terms of the … . Also, by moving the origin to the A.M., equations (3) may be written:





III equations (11) > . • - ^ are the values of ^ , 

^ • • •» 4 when the origin is at the A.M. 

Note that the right hand members of equations (11) are isobaric and that $b_0$ is of weight 2; $b_1$ of weight 1; $b_2$ of weight 0; and in general $b_i$ is of weight $2 - i$.


Now let $f_0 = b_0 / b_n$, $f_1 = b_1 / b_n$, and in general $f_i = b_i / b_n$; hence $F(x)/b_n = f_0 + f_1 x + \cdots + f_{n-1} x^{n-1} + x^n$. Therefore when the f's are computed we note that $f_0$ is of weight n; $f_1$ is of weight n − 1; and in general $f_i$ is of weight $(2 - i) - (2 - n) = n - i$. Since $f_0$ is the product of all the roots, $f_1$ the sum of the products taken n − 1 at a time and so on, and $f_{n-1}$ is the sum of all the roots (due consideration being taken of the signs), it follows that all the roots of F(x) are invariants of index 1 under the transformation x' = ax + c.

Now if equation (4) be solved in the form of equation (7) 
then it can be seen by actual substitution of the indices of 3 and 
the roots of 3(X) which are involved in the constants that the 
exponents k and t of factors of the form (^i - ^ and 


4 4 tan 

e 


are invariants of index zero. The 


first factor occurs for a real root of F(X) and the second for a pair of conjugate complex roots of F(X). The fact that the exponents k are invariants of index zero will be generalized for the case where complex roots do not occur and where no real root is repeated.


If complex roots do not occur the differential equation 




= .y 


can be written » — f- 

~t FCx) 4 , 1 ; 


.X+v2 




Vh. 


FCx) y 

where in separating into partial fractions and equating 

coefficients of like powers^ of X we obtain ri equations in n un- 
knowns -and since the roots are all of weight 1 the weights of the 
augmented matrix will be (the unknowns of the system are the 

in^): 


77- f 77- f 77-! 77-! ^ 

7i-7L 77-7. ‘ n-jL 77-7- O 

. ^ 

O O o o ^ 


Applying the Lemma we see that the $m_i$ are all of the same weight (since $\delta$ = 0). Expanding the determinant in the numerator by minors we see that the … are of weight n − 2. Since … is of weight 2 − n, we have … is of weight zero. Therefore the $m_i$ are invariants of index zero under the transformation x' = ax + c.

We have now considered all of the constants of the curves except the constant of integration. Let the solutions be written in the form: y = K·G(x).

Now it is possible to write G(x) in such a form that G(x) is a covariant of index zero under the transformation x' = ax. In the case of real roots this is accomplished by dividing both the numerator and the denominator of each partial fraction by the root involved in the fraction before the integration is performed. Partial fractions which involve complex roots can be similarly treated. This is the way Pearson actually treated his Types I, II, III, IV, and VII curves although he did not deal with his Types V and VI curves in this manner.

After we have our solution in the form which makes G(x) a covariant of index zero, then if we write X' for aX the total frequency between nX' and (n + 1)X' will be the same as the total frequency between naX and (n + 1)aX. Therefore y is a covariant of index (−1). Hence the constant of integration is an invariant of index (−1).

An example will now be given. Take the equation to which Elderton (loc. cit.) fits a Type I curve on pages 54-59. He has used a unit of 5 years. Suppose we wish to change to a unit of 1 year. Then the constants $a_1$ and $a_2$, being the roots of F(x), are invariants of index 1 and are each multiplied by 5 and become 9.98190 and 67.63640 respectively. Since $m_1$ and $m_2$ are invariants of index zero they remain unchanged and are as he gives them, viz. .409833 and 2.776978. The constant of integration being an invariant of index −1, it is divided by 5 and becomes 29.892. The equation with a unit of 1 year becomes (see top of page 58):


$y = 29.892 \left(1 + \dfrac{x}{9.98190}\right)^{.409833} \left(1 - \dfrac{x}{67.63640}\right)^{2.776978}$

Suppose that now we wish to move the origin to age 26.75942. Then the above equation becomes:

$y = 29.892 \left(1 + \dfrac{x - 26.75942}{9.98190}\right)^{.409833} \left(1 - \dfrac{x - 26.75942}{67.63640}\right)^{2.776978}$

Finally suppose we wish to change to a total frequency of 2000 instead of 1000 as in the given sample. Then the equation becomes:

$y = 59.784 \left(1 + \dfrac{x - 26.75942}{9.98190}\right)^{.409833} \left(1 - \dfrac{x - 26.75942}{67.63640}\right)^{2.776978}$
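The three changes made in the example (unit, origin, total frequency) can be collected in a short sketch. The class and function names are hypothetical, the Type I form y = y0 (1 + X/a1)^m1 (1 − X/a2)^m2 is assumed, and the five-year constants are back-computed here from the one-year values quoted in the text rather than copied from Elderton.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TypeICurve:
    """y = y0 * (1 + X/a1)**m1 * (1 - X/a2)**m2, where X = x - mode."""
    y0: float            # constant of integration, index -1
    a1: float            # index 1
    a2: float            # index 1
    m1: float            # index 0
    m2: float            # index 0
    mode: float = 0.0    # position of the mode on the x axis

    def density(self, x):
        X = x - self.mode
        return self.y0 * (1 + X / self.a1) ** self.m1 * (1 - X / self.a2) ** self.m2


def change_unit(curve, a):
    """New unit such that x' = a*x (a = 5 takes 5-year units to 1-year units)."""
    return replace(curve, y0=curve.y0 / a, a1=curve.a1 * a, a2=curve.a2 * a,
                   mode=curve.mode * a)


def change_origin(curve, h):
    """New origin such that x' = x - h; only the position of the mode moves."""
    return replace(curve, mode=curve.mode - h)


def change_total(curve, n_old, n_new):
    """New total frequency; only the constant of integration changes."""
    return replace(curve, y0=curve.y0 * n_new / n_old)


# Five-year constants back-computed from the one-year values in the text (assumption).
five_year = TypeICurve(y0=5 * 29.892, a1=9.98190 / 5, a2=67.63640 / 5,
                       m1=0.409833, m2=2.776978)
one_year = change_unit(five_year, 5)
shifted = change_origin(one_year, -26.75942)   # mode now sits at age 26.75942
doubled = change_total(shifted, 1000, 2000)
print(one_year)
print(doubled)
```

Keeping the exponents fixed and touching only y0, a1, a2 and the mode mirrors the statement that the exponents are invariants of index zero.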


4. Conclusion: Benefits of this Information. If the diff. eq. (2) be written in the form (4) by means of the transformation (3) then the integration is more easily accomplished. That is to say: in general the solution in the form of eq. (7) is more readily obtained from (4) than some equivalent form of solution would be from (2). Thus a solution in the form of eq. (7) is not only more easily obtained but also lends itself readily to a change of origin.

Each type of Pearson's Curves may be written in a number of ways. The numerical example given above shows the convenience and advantage of writing a solution so that the origin is at the mode, G(x) is a covariant of index zero, y a covariant of index (−1) and the constant of integration an invariant of index (−1).

Regardless of what form is selected for writing a solution the solution will be a covariant and the constants will be invariants, but not necessarily of the indices mentioned above. A knowledge of these invariants will save much labor if it is desired to make a change in the unit of measure.

Similar laws of transformation can be worked out for (1) solutions of the diff. eq. … where both the numerator and the denominator of the right hand member are integral rational functions of x and (2) for the Gram-Charlier Types A and B series. In the first case we obtain the same result as outlined above for the simpler diff. eq.; that is, the solution may be written in the form y = K·G(x) where G(x) is a covariant of index zero, y a covariant of index (−1) and K is an invariant of index (−1). In the case of the Type A series the coefficient of each term is an invariant of index zero.

George Washington University. 



QUADRATURE OF THE NORMAL CURVE

By

E. R. Enlow


There are three formulas for the calculation of areas under 
the normal probability curve, only two of which seem to be 
generally recognized in American statistical circles. Herewith is 
presented an outline of the mathematical development of these 
three formulas and a determination of the bounds of practical 
utility of each. 

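The well-known equation for the normal curve,

$y = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2},$

may be expanded into the series

$y = \dfrac{1}{\sigma\sqrt{2\pi}} \left[ 1 - \dfrac{x^2}{2\sigma^2} + \dfrac{x^4}{2!\,(2\sigma^2)^2} - \dfrac{x^6}{3!\,(2\sigma^2)^3} + \cdots \right]$ (Ref. 3)

by means of Maclaurin's Theorem. (See any good calculus textbook.) (7) This expansion is readily accomplished by making the substitution

$t = \dfrac{x}{\sigma\sqrt{2}},$

so that

$e^{-x^2/2\sigma^2} = e^{-t^2}$ and $f(t) = e^{-t^2}.$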

The process of successive differentiation is quite lengthy, 
since every other term differentiated becomes zero and therefore 
2n terms in the Maclaurin series are required to produce n terms 
in the new series. After the expansion has been carried to five 
or six terms, a regular law of formation becomes evident from 


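inspection of the new series

$e^{-t^2} = 1 - t^2 + \dfrac{t^4}{2!} - \dfrac{t^6}{3!} + \dfrac{t^8}{4!} - \cdots,$ the nth term being $(-1)^{n-1}\dfrac{t^{2n-2}}{(n-1)!}.$

After making the reconversion $t = x/(\sigma\sqrt{2})$ and substituting the value of $e^{-t^2}$ in the original equation for the normal curve, we have

$y = \dfrac{1}{\sigma\sqrt{2\pi}} \left[ 1 - \dfrac{x^2}{2\sigma^2} + \dfrac{x^4}{2!\,(2\sigma^2)^2} - \dfrac{x^6}{3!\,(2\sigma^2)^3} + \cdots \right]$ (Ref. 1, 7)

as previously indicated. This series is uniformly convergent and may therefore be integrated term by term.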

The area under all or any portion of the normal curve is 
calculated from the integral of the equation for the curve: 

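$\int y\,dx = \dfrac{1}{\sigma\sqrt{2\pi}} \int e^{-x^2/2\sigma^2}\,dx.$

To simplify the procedure, let $x = \sigma t\sqrt{2}$; then

$\int y\,dx = \dfrac{1}{\sigma\sqrt{2\pi}} \int e^{-t^2}\,\sigma\sqrt{2}\,dt = \dfrac{1}{\sqrt{\pi}} \int e^{-t^2}\,dt$ (when $t = x/(\sigma\sqrt{2})$).

The value of the definite integral representing the area between the ordinate at the mean and the general ordinate whose abscissa is t is

$\dfrac{1}{\sqrt{\pi}} \int_0^t e^{-t^2}\,dt.$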


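Then, integrating the expanded series for $e^{-t^2}$ (above) term by term, we have:

$\dfrac{1}{\sqrt{\pi}} \int_0^t e^{-t^2}\,dt = \dfrac{1}{\sqrt{\pi}} \left[ t - \dfrac{t^3}{3} + \dfrac{t^5}{5 \cdot 2!} - \dfrac{t^7}{7 \cdot 3!} + \dfrac{t^9}{9 \cdot 4!} - \cdots \right].$

Substituting in this series the value of $t = x/(\sigma\sqrt{2})$ and keeping the expression $x/\sigma$ separate, we obtain:

$\text{Area} = \dfrac{1}{\sqrt{2\pi}} \left[ \dfrac{x}{\sigma} - \dfrac{(x/\sigma)^3}{2 \cdot 3} + \dfrac{(x/\sigma)^5}{2^2 \cdot 2! \cdot 5} - \dfrac{(x/\sigma)^7}{2^3 \cdot 3! \cdot 7} + \cdots \right].$

This series may be extended indefinitely, since the general term is seen to be

$(-1)^n \dfrac{(x/\sigma)^{2n+1}}{2^n\, n!\, (2n+1)}.$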

It will be referred to hereinafter as Series A. 
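A minimal sketch of Series A in code, assuming the general term (−1)^n (x/σ)^(2n+1) / (2^n n! (2n+1)) with the common factor 1/√(2π); the function names are hypothetical and the comparison value comes from Python's built-in error function rather than from printed tables.

```python
import math


def series_a_area(z, terms):
    """Area between the mean and z = x/sigma from the first `terms` terms of Series A."""
    total = 0.0
    for n in range(terms):
        total += (-1) ** n * z ** (2 * n + 1) / (2 ** n * math.factorial(n) * (2 * n + 1))
    return total / math.sqrt(2 * math.pi)


def reference_area(z):
    """The same area computed from the error function, used only as a check."""
    return 0.5 * math.erf(z / math.sqrt(2))


for z in (0.5, 1.0, 2.0):
    print(z, series_a_area(z, 20), reference_area(z))
```

For small z a handful of terms is enough; for larger z many more terms are needed before the partial sums settle.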

A published statement that Series A is divergent when x/σ > 1 is erroneous (7). It is always convergent, regardless of the value of the deviate, but converges very slowly when t is not small, in which case it is better to use another series obtained by integrating by parts. (1, 7)

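We may write

$\int e^{-t^2}\,dt = \int \left(-\dfrac{1}{2t}\right) d\left(e^{-t^2}\right).$

Then

$\int_t^\infty e^{-t^2}\,dt = \dfrac{e^{-t^2}}{2t} - \int_t^\infty \dfrac{e^{-t^2}}{2t^2}\,dt.$ (Formula: $\int u\,dv = uv - \int v\,du$)

Integrating the integral expression in the last term above, i.e., $\int \dfrac{e^{-t^2}}{2t^2}\,dt$, in like manner (by parts), another term appears, and the equation above becomes:

$\int_t^\infty e^{-t^2}\,dt = \dfrac{e^{-t^2}}{2t} - \dfrac{e^{-t^2}}{4t^3} + \int_t^\infty \dfrac{3\,e^{-t^2}}{4t^4}\,dt.$

Continuing this process by breaking up the integral on the right into another term in the series plus a new integral, repeatedly, produces the infinite series:

$\int_t^\infty e^{-t^2}\,dt = \dfrac{e^{-t^2}}{2t} - \dfrac{e^{-t^2}}{4t^3} + \dfrac{3\,e^{-t^2}}{8t^5} - \dfrac{3 \cdot 5\,e^{-t^2}}{16t^7} + \dfrac{3 \cdot 5 \cdot 7\,e^{-t^2}}{32t^9} - \cdots = \dfrac{e^{-t^2}}{2t} \left[ 1 - \dfrac{1}{2t^2} + \dfrac{1 \cdot 3}{(2t^2)^2} - \dfrac{1 \cdot 3 \cdot 5}{(2t^2)^3} + \cdots \right].$

Now

$\int_0^t e^{-t^2}\,dt = \int_0^\infty e^{-t^2}\,dt - \int_t^\infty e^{-t^2}\,dt$ (part = whole − part).

But

$\int_0^\infty e^{-t^2}\,dt = \dfrac{\sqrt{\pi}}{2}.$* (Ref. 4)

Therefore

$\int_0^t e^{-t^2}\,dt = \dfrac{\sqrt{\pi}}{2} - \int_t^\infty e^{-t^2}\,dt.$

And since the value of the definite integral $\int_t^\infty e^{-t^2}\,dt$ is given by the series above, then

$\int_0^t e^{-t^2}\,dt = \dfrac{\sqrt{\pi}}{2} - \dfrac{e^{-t^2}}{2t} \left[ 1 - \dfrac{1}{2t^2} + \dfrac{1 \cdot 3}{(2t^2)^2} - \dfrac{1 \cdot 3 \cdot 5}{(2t^2)^3} + \cdots \right]$

and, since

$\dfrac{1}{\sigma\sqrt{2\pi}} \int_0^x e^{-x^2/2\sigma^2}\,dx = \dfrac{1}{\sqrt{\pi}} \int_0^t e^{-t^2}\,dt$ (when $t = x/(\sigma\sqrt{2})$),

$\text{Area} = \dfrac{1}{2} - \dfrac{e^{-t^2}}{2t\sqrt{\pi}} \left[ 1 - \dfrac{1}{2t^2} + \dfrac{1 \cdot 3}{(2t^2)^2} - \dfrac{1 \cdot 3 \cdot 5}{(2t^2)^3} + \cdots \right].$

* Good proof in The Encyclopedia Britannica, American Edition, 18…, in article on Infinitesimal Calculus. Also (2).

Substituting $x/(\sigma\sqrt{2})$ for t, this series becomes:

$\text{Area} = \dfrac{1}{2} - \dfrac{e^{-x^2/2\sigma^2}}{\sqrt{2\pi}} \left[ \dfrac{\sigma}{x} - \left(\dfrac{\sigma}{x}\right)^3 + 1 \cdot 3 \left(\dfrac{\sigma}{x}\right)^5 - 1 \cdot 3 \cdot 5 \left(\dfrac{\sigma}{x}\right)^7 + \cdots \right].$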

This series will be referred to as Series B. It is asymptotic or semi-convergent (5) (8), a type of series which is frequently obtained by integrating by parts (6). Series B is divergent for values of x/σ below unity. Weld (7) states that this series converges rapidly when x/σ … but does not mention its peculiar asymptotic nature whereby it converges until a minimum term is reached and then diverges. As Townsend (6) indicates, the best approximation of the sum of an asymptotic series is obtained if the series is terminated with the term having the smallest absolute value. This minimum term is the second term for x/σ … and, while the error due to dropping the succeeding terms is less than the last term retained, still this second term has too large a value to permit any very accurate calculation of the area (as will be shown later). However, the accuracy increases as x/σ takes on larger values, since it then takes longer for convergence to the minimum term and this minimum term also grows smaller.
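The rule of stopping an asymptotic series at its smallest term can be sketched as follows. The form of Series B assumed here is Area = 1/2 − e^(−z²/2) / (z√(2π)) · [1 − 1/z² + 1·3/z⁴ − 1·3·5/z⁶ + …] with z = x/σ; the function name and the returned term count are illustrative choices, not part of the original paper.

```python
import math


def series_b_area(z, max_terms=60):
    """Area between the mean and z = x/sigma, stopping Series B at its smallest term.

    Returns the area and the number of bracket terms actually used.
    """
    prefactor = math.exp(-z * z / 2) / (z * math.sqrt(2 * math.pi))
    term, total, used = 1.0, 0.0, 0
    for n in range(max_terms):
        total += term
        used += 1
        next_term = -term * (2 * n + 1) / (z * z)
        if abs(next_term) >= abs(term):
            break            # terms have stopped shrinking: the series now diverges
        term = next_term
    return 0.5 - prefactor * total, used


for z in (1.5, 2.5, 4.0):
    area, used = series_b_area(z)
    print(z, used, area, 0.5 * math.erf(z / math.sqrt(2)))
```

For small z the loop stops almost at once and the result is crude, while for larger z more terms are available before the turning point and the attainable accuracy improves, in line with the behaviour described above.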

Brunt (1) advocates the use of another series, developed by Schlömilch, which he states is better when x/σ is large. This Schlömilch series, hereinafter referred to as Series S, is as follows, in terms of x/σ:
~T.r r 








5 ^ 1 

_ 3/5 ^ 


Series S is readily developed from Series B by transformation of successive terms in the B series to terms with the characteristic Schlömilch denominator. This is more easily accomplished when Series B is in the t form.

To determine the limits of practical utility of each of these three series, actual calculations of areas were made at appropriate intervals, with results shown in Table 1 and Figure 1. All calculations were made "by hand" and carried to 10 or more decimal places.

The three series (formulas) were used in the following forms:



3H5(o , 4XJi4d 5^^0 46 nS'^12£i€> 


BS3 oo<f ^ 4^0 , ito 337 7 ^^ c 


* This term not given by Brunt (1), but calculated by present writer. Last term practicable to use, since next term also has plus sign.

SERIES B:

Area J ‘ 


X 

2 


/aj [o- I fo-g 


X 

Of" 


€ 

'x. 


L (,HiH ^S'l O f-. 3?^ os-f jj£/ - ^ 


r 

;5" 4_ __ j_ io~6HS ri.5 

r™'l''"T7_ ^ ^ R -v ' _ “T”" "*■ 


»“ ■ c?) 


■X \/V 






0X1OZ5 3VV57VjZ5 . G5//73.<fei^ >37 li>5 IS^ 

- ■ ? — ^ rrr- r-r f ' ' 


X \ 


m 


(f) 


n 


tC \>c> 


Ct) 


cfr^ 


SERIES S:

Area J ^ /^ - ^ ^ V 


l-.S^ff 934 X}l-f/ A- 




" J ( fT* to} ^ 


Table 1 shows the numbers of terms required in each series to give areas accurate to 3, 4, 5, 7 and 9 decimal places, respectively, for values of x/σ ranging from .25 to 5.00. Calculations were checked by Sheppard's Tables (4) and the accuracy was determined on the principle that the error is less than the last term retained or the first term dropped*. Where X is used it indicates that the series cannot give the accuracy indicated.

* This does not hold strictly true of Series S, since it is a modification of the true Series B.

The graph, Figure 1, shows the approximate domain of utility of each series under conditions of accuracy ranging from 3 to 9 decimal places. Note that Series S covers a wider range than does Series B, including the entire domain of Series B. Hence we may conclude that, while it is essential as a basis for the derivation of Series S, Series B may be discarded for area calculations. Moreover, Series S is not only more valuable than Series B because of its wider range of utility but also because its more rapid convergence gives a desired degree of accuracy with fewer terms than are necessary with Series B.


FIGURE 1 

Domains of Practical Utility of Three Infinite Series in Calculat- 
ing Area Under Normal Probability Curve. 



As an illustration of the use of this graph (Figure 1), we note that for five-decimal accuracy Series A must be used for all values of x/σ up to 2.50, and that Series S may be used for x/σ = 2.50 and all larger values. Table 1 shows that the number of terms required in Series A for five-decimal accuracy increases from 3 at x/σ = .25 to approximately 14 at x/σ = 2.25; while, beginning at x/σ = 2.50, Series S requires but 4 terms, and this number diminishes to 1 term for x/σ = 5.00.
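The Series A counts quoted from Table 1 can be approximated with a short sketch. The stopping rule used here (keep adding terms until the first term dropped is smaller than half a unit in the last decimal place) is an assumption; it reproduces the two counts tested below, 3 terms at x/σ = .25 for five decimals and 10 terms at x/σ = 2.00 for four decimals, but the author's exact criterion is not stated and other entries may differ.

```python
import math


def series_a_term(z, n):
    """Absolute value of the n-th term (n = 0, 1, 2, ...) of Series A for z = x/sigma."""
    return z ** (2 * n + 1) / (2 ** n * math.factorial(n) * (2 * n + 1) * math.sqrt(2 * math.pi))


def terms_needed(z, decimals):
    """Number of leading terms kept when the first dropped term is below 0.5 * 10**-decimals."""
    tolerance = 0.5 * 10 ** (-decimals)
    n = 0
    while series_a_term(z, n) >= tolerance:
        n += 1
    return n


for z in (0.25, 1.00, 2.00):
    print(z, [terms_needed(z, d) for d in (3, 4, 5, 7, 9)])
```

The count grows steadily with x/σ, which is why a different series becomes preferable for large deviates.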





TABLE 1.

Numbers of Terms Required for Varying Degrees of Accuracy in Calculation of Increasing Proportions of Area Under the Normal Probability Curve.

          3 decimals      4 decimals      5 decimals      7 decimals      9 decimals
 x/σ      A    B    S     A    B    S     A    B    S     A    B    S     A    B    S
  .25     2    X    X     2    X    X     3    X    X     4    X    X     5    X    X
  .50     3    X    X     3    X    X     4    X    X     5    X    X     6    X    X
  .75     4    X    X     4    X    X     5    X    X     7    X    X     8    X    X
 1.00     4    X    X     5    X    X     6    X    X     8    X    X     9    X    X
 1.25     5    X    X     6    X    X     7    X    X     9    X    X    11    X    X
 1.50     7    X    X     8    X    X     9    X    X    11    X    X    12    X    X
 1.75     7    X    5     9    X    …     …    X    X    12    X    X    14*   X    X
 2.00     9    X    3    10    X    4    12    X    7    14*   X    X    16*   X    X
 2.25    11    X    3    12    X    3    14*   X    7    15*   X    X    18*   X    X
 2.50    12    2    2    13*   X    2    15*   X    4    17*   X    X    20*   X    X
 2.75     -    1    1     -    X    2     -    X    4    19*   X    X    23*   X    X
 3.00     -    1    1     -    3    2     -    X    3     -    X    4    27*   X    X
 3.50     -    1    1     -    1    1     -    3    2     -    X    4     -    X    7
 4.00     -    1    1     -    1    1     -    1    1     -    5    3     -    X    6
 5.00     -    1    1     -    1    1     -    1    1     -    1    1     -    3    2

* Estimated by graphic extrapolation.

Explanatory: Read table as follows: The number of terms required in Series A to calculate to 4 decimal places of accuracy the portion of the area under the normal curve lying between the ordinates at x/σ = 0 and x/σ = 2.00 is 10; with Series S it is 4; the calculation is impossible to this degree of accuracy with Series B.

Notes: X indicates impossible calculation; - indicates impracticable calculation.

CONCLUSIONS 

All areas under the normal curve may be calculated by the 
use of Series A and Series S, the two being complementary. 

Methods of developing Series A and Series B are outlined 
and it is indicated that Series S is derived from Series B. 




















The domain of practical utility for each series is shown in Figure 1. The numbers of terms required for various degrees of accuracy are shown in Table 1.

SELECTED REFERENCES 

1. Brunt, David, The Combination of Observations. Cambridge University Press, 1917.

2. Elderton, W. P., Frequency Curves and Correlation. Layton, London, 1927.

3. Kelley, T. L., Statistical Method. Macmillan, 1923.

4. Pearson, K., Tables for Statisticians and Biometricians. Part I, Second Edition (1924). Cambr. U. Press.

5. Rietz, H. L., Editor, Handbook of Mathematical Statistics. Houghton Mifflin Co., 1924.

6. Townsend, E. J., Functions of Real Variables. Holt, 1928.

7. Weld, L. D., Theory of Errors and Least Squares. Macmillan, 1916.

8. Wilson, E. B., Advanced Calculus. Ginn and Co., 1912.



EDITORIAL: A. L. O'TOOLE


ON A BEST VALUE OF R IN SAMPLES OF R 
FROM A FINITE POPULATION OF N. 


In recent years the problem of finding the moment coefficients for samples of r drawn from a finite population of N has been of interest to so many writers¹ that it seems worthwhile to make a few further observations² concerning these moment coefficients, particularly with respect to their dependence on r. In many instances the value of r to be used is at the discretion of the investigator and he would like to know if there is one value of r which is better than any other. An answer to that question will be given here.

¹ H. C. Carver, On the fundamentals of the theory of sampling, Annals of Mathematical Statistics, Vol. I, No. 1, pp. 101-121; Vol. I, No. 3, pp. 250-274.

C. C. Craig, An application of Thiele's semi-invariants to the sampling problem, Metron, Vol. VII, No. 4, 1928, pp. 3-74.

R. A. Fisher, Moments and product moments of sampling distributions, Proc. London Math. Soc., Series 2, xxix, 1929, pp. 309-321; xxx, 1929, pp. 199-238.

L. Isserlis, On a formula for the product moment coefficients of any order of a normal frequency distribution in any number of variables, Biometrika, xii, 1918-19, pp. 134-139.

P. R. Rider, Moments of moments, Proc. of the National Academy of Sciences, Vol. 15, 1929, pp. 430-434.

H. E. Soper, Sampling moments of samples of n units each drawn from an unchanging sampled population, from the point of view of semi-invariants, Journal of the Royal Statistical Soc., Vol. 93, 1930, pp. 104-114.

A. A. Tchouproff, On the mathematical expectation of the moments of frequency distributions, Biometrika, xii, 1918-19, pp. 140-169 and 184-210; xiii, 1920-21, pp. 283-295.

A. L. O'Toole, On symmetric functions and symmetric functions of symmetric functions, Annals of Mathematical Statistics, Vol. II, No. 2, May 1931, pp. 102-149. See Chapter III.

² These observations arose as a result of some very far-reaching suggestions on the theory of sampling made by Professor Carver in recent conversations with the writer.





The differential operator method developed by this writer³ for finding the moment coefficients not only was a very simple method but had the added advantage of leading directly to some theorems whose generality had not been established previously.

Using the notation of the previous paper let the finite parent population of N be composed of the N variates $x_1, x_2, x_3, \ldots, x_N$. From this population draw all of the different samples of r and let $z_i = \sum x$, $i = 1, 2, 3, \ldots, \binom{N}{r}$, where $\sum x$ designates the sum of the r values of x which appear in the i-th sample. With this notation it has been shown in the paper cited that


X. J K 

n-f’rL 


(1) 5 = ^/ T Y -r^ 


T K 

5. S. S. 
ox j.x. 


(<■ !) CIO m - (iiksixki)- 


TtCji. 


where 5. . = ^ t = 1, 2, 3, 

c=t ^ ^ 


/¥ 


and 


X/ 


W =.1,2, 3, 


The summation in (1) is to be taken over terms such that 

iL-fJjt KA +■ -t where J, J, K 

are positive integers, and where ^ is obtained from the
sampling polynomial by replacing the exponents of the 

polynomial by corresponding subscripts. 

(2) , 



³ Loc. cit.



148 SAMPLES FROM A FINITE POPULATION 

In particular ^ ^ 7 /^3 

^ - 7 ./^^ F fX 

= -fa, - / 5 ^-f 60 ^^- 

fi = ^ - 3f + /SO - 3‘^0 + /-Z^A 

^ = (e‘>Xj:i.^-ZIOO^^+--^3lp^^-Z’'>Zo^j'j^O,^^ 

where " /v-k ^ -^, 

It must be kept in mind that the multiplication of these operators is symbolic. For example, to find the product of two of them, first multiply the corresponding polynomials by ordinary multiplication; the result, when the exponents in this product are replaced by corresponding subscripts, is the required product.

Since in this paper it is desired to consider moments rather 
than power sums, replace - 5 ^ . ^ by (X^)/xljxA s^, J}y 
Then ( 1 ) becomes, after dividing by ^ , 


Now -A 


T Jl^R I+SrK 

M. S ? gg - ' « 




: C 

/»'-4 > 


Ci!) Of) ao • • X/ - Jh K! 

■A hence, 

A _ . . (jL-Ai-t) 

Siibstituting this value for each ^/^Ca. “ (A) the result is 

S <^oet)f 077 S 3 , 

X;z-='^A'x 

= i" F j 

^ 3:?6 '(r^HXAf-z) Oi-00y-.n.)yU.^,^ M.^. 


:x 





^ 2 , 

•f 6/^ (M‘-i)(M-x)(n/’'^) -^^"x 

'^ ¥ A* Ol'-f) 0 "^3 ^:k. 

f- 3 u (jt-t) (/r-‘Jt)(A/'-Ji-p) ^ 

•f (/Y'^6>yi/Jh f-£i>M^t /V ) 

etc. 

Now let N = ar. Then

iqs/qii4?'yis <o. 


r 




yv 


a>Ji- 




.M-. 


= 1 ! (y-Oj->)(Jl-x)Mi + 3 CL (Ji-i)(<^-l)Mt^ / 4 ;^ 

('eL^-iXaJL-J.)L ‘‘^ 


Av; 


y2' 


* (iZA,-0(aJi-xXaA-B) ^ ^ 

^ doc Oi-i)(<i-i)\-^(<i'*h^^ 


2- 
'X 


etc. 



A partial check at this point is to note that for a = 1 only the first term of each of these moment coefficients remains.

Let a = 2. Then the above moment coefficients become


r 








(7)i 


Zyz 


V; 2: 






I 


" L A-x ^ A. x] 

l-l ^ (a -7-) ^ ^ A: J 

f *I ^ / 

- '^.xj 






etc. 

It is observed that when a = 2, i.e. when N = 2r, the third moment coefficient of z is independent of the third moment coefficient of x. Also the fifth moment coefficient of z is independent of the fifth moment coefficient of x. But one must not assume that all the odd moment coefficients of z are independent of the corresponding odd moment coefficients of x. For a higher odd moment coefficient of z is not independent of the corresponding moment coefficient of x, as is seen by evaluating the coefficient of the latter in the expression for the former.

So far the moments considered have been the moments of x with respect to the origin from which x is measured and the moments of z with respect to the origin from which z is measured. Consider now the moments of x about the mean value of x and the moments of z about the mean value of z. That is let


It 





3 


c * 




/v". 





Then 

7 ^2 = since by (5). 

Jt\C JLxi 

= ^ Cy^-M^) = 2" 5r . 

e.g. z, = z, - /%. = + + 

= ( x^, - ) -f- A/^) + t^3- - • • +^{:x^- } 



Hence it is clear that Z is the same function of X as z is of x. In other words each moment of z about the mean of z is the same function of the moments of x about the mean of x as the corresponding moment of z about the origin was of the moments of x about the origin. There is one important simplification however, due to the fact that the first moment of x about its mean is zero; hence all terms which involve it vanish. With this in mind (6) becomes





■A^;x 


(OJV f )(c(Jl--x) 

I M il- 




CaA-i)(aA-x)CaA-3) 

M. X too- (Q-f }C«-x)(jx-i) (ajt.-M.-i) ^ 

(aA~i }(ctJt-x)Co.yi~'3)(a^-‘f) s:x. 

(a-i)La-x)Ca^'A-ia.a.A- tiXA+Sa.) ^ 


etc. 

Here again it is noticed that for <2. a 2, i.e. for xf = 2Jl, 
is independent of . In other words the skewness of the 

distribution of 2 is independent of the skewness of the parent 





population of x . Similarly -^5.^ is independent^' of 
also independent of • But since ^ is not zero for a. 

is independent of * 

Now consider the variance of ^ ^ 




yj^CcL-t'^ 

Uj\ - ; 


r 7 C 


and 
= 2, 


r (since tI-o-Jl). 

Obviously it would be very desirable to have the variance (squared
standard deviation) a minimum. Since the variance is a function
of $a$, differentiate with respect to $a$.


d- j, _ . 

-jjl '^2:9. - /V-/ - 

To make yx., . _ a minimum ^ 

* A''- t *2;x 

or, that is, (X = 2. 


When ^ - 2, 


■ 






'a>x. 7 


0 and hence n ^ 2 A 


N 


<r = 

xfsrj 


r 


In conclusion it may be said that there would seem to be
good reason to suggest that, when possible, the investigator
arrange to have twice as many variates in the control group or
parent population as in each of the samples to be analyzed. Taking
$r = N/2$ will insure that the skewness of the samples will be
independent of the skewness of the parent population and also
that the fifth moment of the samples will be independent of the
fifth moment of the sampled population. In addition, taking
$r = N/2$ will cause the variance (squared standard deviation)
of the samples to be a minimum. Choosing $r = N/2$ presumes,
of course, that $N$ is an even number. But in most instances it should
be possible to arrange that $N$ be even. For if an odd number of
observations are given either another observation may be added or
one of the given observations deleted, to make $N$ even.

« ^ vanishes with ^ because ^ E(,1~12 E). But E is not 

a'factor'<tf E . ' ■ ' '■■/■ • '• ■' 



EDITORIAL; H. C. CARVER 


PUNCHED CARD SYSTEMS AND STATISTICS 


Because of the increasingly important part being played by
mechanical devices in statistical methodology, it seems desirable
to call attention in the Annals to some of the possibilities of
punched-card systems.

The standard punched card, illustrated below, is seven and
one-half inches by three and one-quarter inches in size. To a
certain extent the operation of a punched-card system is analogous
to that of the Teletype machines used in wiring messages. In
the latter case telegrams are written on a special typewriter which
translates the message into a series of electrical impulses that
in turn operate a distant typewriter which prints the message on
a strip of paper; in the former, cards are automatically fed
through a special typewriter that both prints the words or numbers
on each card and also translates the information into properly
punched holes in the card. The data on these cards may be
totaled if desired by running the cards through a tabulating
machine at a rate exceeding one per second; the total of the
squares of the numbers appearing on consecutive cards may be
obtained automatically; if the variates $x$ and $y$ be punched in
respective columns on each card, the total of the products for
all the cards is likewise made available; the cards may be arranged
in order of magnitude according to card number, date, or variates
at a rate exceeding six per second; and finally the data on the
cards may be printed on a scroll, the cards passing through the
listing or printing machine at about 80 per minute.

[Fig. 1: illustration of a punched card.]

In order to provide an actual problem to serve as an
illustration, I secured the anthropometric records of one thousand
first year male students who entered the University of Michigan
in the fall of 1928. The 1,000 cards, of which Figure 1 is a sample,
were punched by an operator of average ability in slightly less
than three hours. The data for the students were selected at
random and the cards, as punched, were numbered consecutively
from 1 to 1,000, the card selected for Figure 1 being the 275th.
The weight to the nearest pound of this individual was 125
pounds, and the height, width of shoulders, and the circumferences
of chest, waist, hips and right thigh were, respectively, 64.0, 16.8,
34.2, 26.0, 34.3 and 20.2, linear measurements being made to
the nearest one-tenth inch.

These 1,000 cards may now be placed in a tabulating and
listing machine which can total all seven of the data fields
simultaneously. If desired, the machine will also print all of the
information of each card on a scroll together with the totals. The first
and last parts of this scroll are reproduced below photographically,
the names of the individuals being omitted purposely. Because
of both the number of columns involved and the magnitudes of
the totals, the listed totals unfortunately run together. The
vertical lines, inserted with a pen, facilitate the reading of the totals.
The cards are totaled and listed simultaneously at the rate of 80
per minute.





[Photographic reproduction of the first and last parts of the
printed scroll, with the totals of the seven data fields in the
final lines.]


Fig. 2 

An investigation of the correlation that may exist between
height and weight will involve the numerical value of

$\sum_{i=1}^{1000} x_i y_i$

where $x_i$ and $y_i$ designate the height and weight, respectively,
of the $i$th individual. The plugboard of an Automatic
Multiplying Punch may be wired in a few seconds so that

(a) the data of columns 34, 35 and 36 of Figure 1 will
feed into the multiplier of the punch,

(b) the data of columns 38, 39 and 40 will feed into the
multiplicand, and then

(c) the product, $x_i y_i$, for any card run through the machine
will appear on the product summary counter. As the
cards pass through the machine, the total of the
products is accumulated on this counter.

If desired, each product may be punched automatically in the
card, provided of course the card contains a sufficiently large
number of otherwise vacant columns. The maximum number of
digits in current models that may occur in either multiplier or
multiplicand is eight. The number of digits in the multiplicand
does not affect the speed of the multiplication; for three or less
digits in the multiplier the cards feed through the machine at
the rate of three seconds per card; for eight digits in the
multiplier the speed is twelve cards per minute. One may therefore
place our cards in the machine, press a button, resume other
duties, and some fifty minutes later the 1,000 cards will have
yielded the total $\sum x_i y_i$.

To obtain the sum of the squares of the variates in question
it is necessary only to double-wire the machine, one wire going
to the multiplier and the other to the multiplicand. We obtain
then

5 : ^ ai 5 n 692 H5Z . 

By permitting the machine to punch each value of $x_i^2$ in the
card, we may treat $x_i^2$ as the multiplicand and $x_i$ as the
multiplier and then obtain the sum of the cubes of the variates; or
by double-wiring $x_i^2$ obtain the sum of the fourth powers of the
variates. If, while accumulating the cubes of the variates, we let
the machine also punch each cube in the card, we may then obtain
the sum of the powers of the variates up to and including that
of the sixth order, etc. We are limited, of course, by the fact
that the card contains eighty columns.
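
As a present-day illustration of why these totals suffice, the following
minimal sketch computes a product-moment correlation from nothing but the
running totals that the tabulator and multiplying punch accumulate. The
function name and the five sample cards are hypothetical and are not taken
from the 1,000 cards described above.

```python
from math import sqrt

def correlation_from_totals(n, sum_x, sum_y, sum_xx, sum_yy, sum_xy):
    """Pearson correlation computed only from running totals."""
    cov = sum_xy - sum_x * sum_y / n          # n times the covariance
    var_x = sum_xx - sum_x ** 2 / n           # n times the variance of x
    var_y = sum_yy - sum_y ** 2 / n           # n times the variance of y
    return cov / sqrt(var_x * var_y)

# Hypothetical cards: (height in tenths of an inch, weight in pounds).
cards = [(640, 125), (681, 158), (674, 134), (698, 160), (712, 145)]

# Each total corresponds to one pass of the cards through a machine.
n = len(cards)
sum_x = sum(x for x, _ in cards)
sum_y = sum(y for _, y in cards)
sum_xx = sum(x * x for x, _ in cards)         # "double-wired" squares
sum_yy = sum(y * y for _, y in cards)
sum_xy = sum(x * y for x, y in cards)         # multiplier times multiplicand

r = correlation_from_totals(n, sum_x, sum_y, sum_xx, sum_yy, sum_xy)
print(round(r, 3))
```

Only six numbers need be carried away from the machines, whatever the
number of cards.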

By running the punched cards through a sorting machine,
we may obtain very readily the frequency distribution of the
weights, and also the corresponding median, quartiles, etc. To
accomplish this the cards must be run through a sorting machine
three times, first sorting to column 35 of Figure 1, then to column
34 and finally to column 33. The cards pass through the sorting
machine at the rate of 400 per minute, so that in approximately
eight minutes, including time spent in handling the cards between
sorts, these 1,000 cards will be perfectly arranged according to
magnitude in weight. If the numbers with respect to which the
sort is to be made contain $N$ digits, the cards must be run
through the sorter $N$ times. We reproduce on the following page
a photograph of the first part of a printed scroll obtained by
running the cards through the listing machine after they had
been sorted according to weight.
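
The column-by-column procedure just described is what would now be called
a least-significant-digit radix sort. The sketch below is only an
illustration under assumed data: the card layout (a dictionary with a
zero-padded three-character weight field) and the function name are
hypothetical.

```python
def sort_by_columns(cards, field, digits):
    """Least-significant-column-first sort, one pass per digit column."""
    for col in range(digits - 1, -1, -1):       # rightmost column first
        pockets = [[] for _ in range(10)]       # one pocket per digit 0-9
        for card in cards:
            pockets[int(card[field][col])].append(card)
        # Restack pocket 0 first; order within a pocket is preserved,
        # which is what makes the earlier passes survive the later ones.
        cards = [card for pocket in pockets for card in pocket]
    return cards

# Hypothetical cards; the weight field is a zero-padded three-column string.
cards = [{"card": 1, "weight": "158"}, {"card": 2, "weight": "089"},
         {"card": 3, "weight": "134"}, {"card": 4, "weight": "130"},
         {"card": 5, "weight": "134"}]

ordered = sort_by_columns(cards, "weight", 3)
print([c["weight"] for c in ordered])
```

A field of $N$ digits requires exactly $N$ passes, whatever the number of
cards, which agrees with the three passes needed for the weights.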









[Photographic reproduction of the first part of the printed scroll,
the cards being listed in order of weight. Columns: card number,
weight, height, shoulders, chest, waist, hips, right thigh.]





A rough notion of the functional dependence that exists
between weight and the other six variables recorded on the cards
may be obtained by permitting the machine to total these
ordered-with-respect-to-weight cards in consecutive groups of 100. That
is, we obtain the averages for numerically equal groups selected
according to the weight-deciles. The six regression lines may
therefore be plotted, approximately, from the following results:


TABLE 1.

Anthropometric Averages Based on Weight Deciles.

Inter-decile
Range      Weight       Height       Shoulder  Chest        Waist        Hips    Rt. Th.

First      [illegible]  [illegible]  15.576    [illegible]  25.748       32.968  18.133
Second     [illegible]  66.659       15.980    33.663       26.777       33.927  18.926
Third      [illegible]  67.087       16.161    34.421       26.978       34.387  19.206
Fourth     [illegible]  67.381       16.334    34.816       27.622       34.858  19.554
Fifth      [illegible]  [illegible]  16.406    34.860       27.954       35.081  19.893
Sixth      [illegible]  68.189       16.651    [illegible]  28.065       35.511  20.112
Seventh    [illegible]  68.576       16.789    35.766       28.513       36.006  20.438
Eighth     [illegible]  68.895       16.807    36.116       28.780       36.420  20.712
Ninth      [illegible]  69.185       17.022    36.788       [illegible]  37.181  21.444
Tenth      [illegible]  69.854       17.611    38.596       [illegible]  38.826  22.678
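
The grouping that produces a table of this kind may be sketched as follows.
This is a minimal illustration only: the record layout, the field names and
the six sample records are hypothetical, and groups of two stand in for the
groups of 100 cards used above.

```python
def group_averages(records, key, group_size):
    """Order records by `key`, then average every field over consecutive groups."""
    ordered = sorted(records, key=lambda rec: rec[key])
    averages = []
    for start in range(0, len(ordered), group_size):
        group = ordered[start:start + group_size]
        averages.append({field: sum(rec[field] for rec in group) / len(group)
                         for field in group[0]})
    return averages

# Hypothetical records: weight in pounds, height in inches.
records = [{"weight": w, "height": 60 + w / 20}
           for w in (150, 110, 130, 120, 140, 160)]

table = group_averages(records, "weight", 2)
for row in table:
    print(row)
```

Plotting each averaged field against the averaged weight gives the
approximate regression lines referred to in the text.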


If we had arranged the cards numerically with respect to
height, instead of weight, we would have obtained the following
results:


TABLE 2.

Anthropometric Averages Based on Height Deciles.

Inter-decile
Range      Weight   Height       Shoulder     Chest        Waist   Hips    Rt. Th.

First      123.95   [illegible]  [illegible]  34.201       27.371  34.217  19.592
Second     130.97   [illegible]  16.261       34.856       27.984  34.944  19.929
Third      133.75   66.367       16.282       34.787       27.731  35.152  19.926
Fourth     136.28   67.021       16.570       35.429       28.329  35.302  20.084
Fifth      139.81   67.623       16.587       35.379       28.206  35.499  20.223
Sixth      140.60   68.189       16.532       35.510       28.184  35.560  20.008
Seventh    142.65   [illegible]  16.659       35.550       28.380  35.821  20.315
Eighth     143.44   69.498       16.638       [illegible]  28.065  35.858  20.205
Ninth      145.71   [illegible]  16.776       35.778       28.236  35.960  20.095
Tenth      155.72   72.346       16.996       36.423       29.024  36.852  20.740
















THE METHOD OF PATH COEFFICIENTS 

By 


Sewall Wright 

Department of Zoology, The University of Chicago. 

Introduction 

The method of path coefficients was suggested a number of 
years ago (Wright 1918, more fully 1920, 1921), as a flexible 
means of relating the correlation coefficients between variables 
in a multiple system to the functional relations among them. The 
method has been applied in quite a variety of cases. It seems
desirable now to make a restatement of the theory and to review
the types of application, especially as there has been a certain
amount of misunderstanding both of purpose and of procedure.

Basic Formulae

The object of investigation is a system of variable quantities, 
arranged in a typically branching sequential order representative 
of some chosen point of view toward the functional relations. 
Such a system is conveniently represented in a diagram such as
Fig. 1. Those variables which are treated as dependent are
connected with those of which they are considered functions by
arrows. The system of factors back of each variable may be
made formally complete by the introduction of symbols
representative of total residual determination (as $V_u$ in Fig. 1).
A residual correlation between variables is represented by a
double-headed arrow.

Fig. 1

It will be assumed that all relations are linear.¹ Thus each
variable is related to those from which unidirectional arrows are
drawn to it by an equation of the following type, where
$(V_0 - \bar{V}_0)$, $(V_1 - \bar{V}_1)$ etc. represent deviations
from the means and $c_{01}$, $c_{02}$ etc. are the coefficients.

(1) $(V_0 - \bar{V}_0) = c_{01}(V_1 - \bar{V}_1) + c_{02}(V_2 - \bar{V}_2) + \dots$

¹ Relations which are far from linear with respect to the absolute
values of the variables may be approximately linear with respect to
variations, if the coefficients of variability are small. Thus the
relations of small deviations from the mean values are approximately
the first order terms of an expansion by Taylor's Theorem. The error
may be represented by a residual term.

It is convenient to measure the deviation of each variable by
its standard deviation. Let $X_0 = (V_0 - \bar{V}_0)/\sigma_0$ etc.
and let $p_{01} = c_{01}\sigma_1/\sigma_0$ etc.

(2) $X_0 = p_{01}X_1 + p_{02}X_2 + \dots$

The coefficients in this form are of the type called path
coefficients. Each obviously measures the fraction of the standard
deviation of the dependent variable (with the appropriate sign)
for which the designated factor is directly responsible, in the
sense of the fraction which would be found if this factor varies
to the same extent as in the observed data while all others
(including residual factors) are constant. This definition (except
for determination of sign) can be written as follows, putting
the constant factors after a dot.

(3) $p_{01} = \sigma_{0 \cdot 23 \dots}/\sigma_0$

It is sometimes convenient to represent the standard deviation due
directly to a particular factor by a symbol. The form $\sigma_{0(1)}$
will be used. Obviously $\sigma_{0(1)} = c_{01}\sigma_1$ and,
neglecting sign,

(4) $p_{01} = \sigma_{0(1)}/\sigma_0 = \sigma_{0 \cdot 23 \dots}/\sigma_0$

The theorem which makes the path coefficient useful in relating
correlations to functional relation is a very simple one. The
correlation between $V_0$ and any other variable $V_q$ in such a
system as Fig. 2 can be written in the form

(5) $r_{0q} = p_{01}r_{1q} + p_{02}r_{2q} + \dots = \sum_i p_{0i}\, r_{iq}$


The correlation is thus analyzed into contributions from all of
the paths in the diagram (Fig. 2) passing through each factor of
one of the variables.

Fig. 2

But the correlation terms symbolized by $r_{iq}$ may be capable
of analysis by application of this same formula. By repeated
analysis of this sort, as far as the diagram (such as Fig. 1)
permits, we are led to the following principle: Any correlation
between variables in a network of sequential relations can be
analyzed into contributions from all of the paths (direct or
through common factors) by which the two variables are connected,
such that the value of each contribution is the product of the
coefficients pertaining to the elementary paths. If residual
correlations are present (represented by bidirectional arrows) one
(but never more than one) of the coefficients thus multiplied
together to give the contribution of a connecting path may be a
correlation coefficient. The others are all path coefficients.

In tracing connecting paths it is obvious that one may trace
back along the arrows and then forward as well as directly from
one variable to the other (perhaps through intervening variables)
but never forward and then back. That two factors affect the
same dependent variable does not contribute to the correlation
between them. Similarly two variables which are correlated with
a third are not necessarily correlated with each other. As
illustrations of these principles consider the correlations between
some of the variables in Fig. 1.


W , 3 <&, ' 




It is sometimes convenient to use an extension of the
symbolism in dealing with compound paths.





oz 




: 


tf ^Pjl'F 4- 4-P 

^ ^01 B oiifB «»5 

In this symbolism all of the variables along a contributing
path are listed in proper order. If the path passes through a
represented common factor, the latter is indicated by a dot. If it
involves an unanalyzed correlation the two ultimate correlated
variables may be indicated by a line as above. The evaluation of
such compound path coefficients is obvious,

p.^ppp 

0( Us 7 O/ Us ZS 

V,r 5 -C,Tf^ , etc. 

It is to be noted that the symbolism does not apply to the 
indicated variables in an absolute sense but is always to be under- 
stood as relative to a particular arrangement of the variables, 
i.e. to a particular point of view with respect to the functional 
relations. 

A special case of equation (5) arises if one correlates a varia- 
ble with itself, taking into account all factors (known and un- 
known) 

(6) $\sum_i p_{0i}\, r_{0i} = 1.$

This may be put in a form which is usually more convenient
by further analysis of $r_{0i}$:

(7) $\sum_i p_{0i}^2 + 2\sum_{i<j} p_{0i}\, p_{0j}\, r_{ij} = 1.$


Degree of Determination 

From the formula $p_{01}^2 = \sigma_{0(1)}^2/\sigma_0^2$ it is
obvious that a squared path coefficient measures the portion of the
variance of the dependent variable for which the independent
variable is directly responsible, under the point of view adopted.
The squared path coefficient may accordingly be called a
coefficient of determination. Such coefficients were used before
the term path coefficient was applied to the square root.
(Wright 1918.)





The sum of the squared path coefficients is unity only in the
case in which there are no correlations among the factors. It is
necessary, therefore, to recognize additional terms measuring the
changes in variance (positive or negative) due to correlated
occurrence of the contributions of such factors
($2p_{01}p_{02}r_{12}$ etc. in equation 7). It is tempting to
apportion determination among the factors by using the terms
$p_{0i}r_{0i}$ of equation (6) as measures of determination, and
this has been done by some authors, e.g. Kirchevsky (1927) who
independently reached a somewhat similar viewpoint on the
interpretation of systems of correlated variables in other respects.
No transparent meaning can be attached to such expressions (which
may be negative). The term does not measure direct determination
since it involves indirect connections between the variables.
Neither does it measure total determination, direct and indirect.
This is given by the squared correlation coefficient.


The Correlation between Linear Functions 
The most direct application of the method is in the estimation 
of the correlation between two variables which are functions (in 
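part at least) of the same variables. Let $V_S$ and $V_T$ be two
variables whose correlation is desired.

$V_S = c_{S1}V_1 + c_{S2}V_2 + \dots$,   $V_T = c_{T1}V_1 + c_{T2}V_2 + \dots$

(8) $\sigma_T^2 = \sum_i c_{Ti}^2\sigma_i^2 + 2\sum_{i<j} c_{Ti}\, c_{Tj}\, \sigma_i \sigma_j r_{ij}$

$p_{Si} = c_{Si}\,\sigma_i/\sigma_S$,   $p_{Ti} = c_{Ti}\,\sigma_i/\sigma_T$

(9) $r_{ST} = \sum_i p_{Si}\, p_{Ti} + \sum_{i \ne j} p_{Si}\, p_{Tj}\, r_{ij}$

Fig. 3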


As an example, suppose that we wish to find the correlation 
between the compound variables 4 ^3id where 
and Vj, * Kf, knowing that ^ and are all of 

equal variability , ^ } 





and are independent 


^ = 
TIt = 




!Z 
3 (f" 


At = , 

l-i 'V 


T -^T -'F 

'sz 's3 T! 


F 

7 " 2 . 


2.3 Xi/ 3^ 


T--fr^ 

ry ^ 




*^:sr ^ 


T -t 7^ 7^ 

‘T/ ^ <J5 2. 


“ ^/3 


Again suppose that we wish to estimate the true correlation 
between two variables from that between measurements known 
to be subject to considerable random error. Assume that the cor- 
relation between two measurements of the same variate has been 
found in each case. It is instructive to work this out from two 
different points of view. 

Let be the mean of rrv ‘measurements 
^ ^ mean of 

•TV measurements of y3 - 

The known correlations are those between 
measures of /4 between measures of 

3 between measures of /4 and B 

- Expressing the complete determina- 
tion of A and B by their components, using 
equation (6) : ^ ^ = 

P- 

^/3B 

From these 



Fig. 4 


/SB 






E .j/ 


The correlation between $\bar{A}$ and $\bar{B}$ can be written

(10) $r_{\bar{A}\bar{B}} = \dfrac{\sqrt{mn}\; r_{A_1B_1}}{\sqrt{[1 + (m-1)r_{A_1A_2}]\,[1 + (n-1)r_{B_1B_2}]}}$

For indefinitely large values of $m$ and $n$ the averages may
be considered as true scores, $A_T$ and $B_T$.

(11) $r_{A_TB_T} = \dfrac{r_{A_1B_1}}{\sqrt{r_{A_1A_2}\; r_{B_1B_2}}}$

This result can be reached much more directly by the simpler
set up (Fig. 5) in which the observed measurements are
represented as functions of true scores $A_T$ and $B_T$ and of random
errors. Note that the directions of the arrows are the reverse of
those in Fig. 4.




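$r_{A_1A_2} = p_{A_1A_T}^2$   giving   $p_{A_1A_T} = \sqrt{r_{A_1A_2}}$

$r_{B_1B_2} = p_{B_1B_T}^2$   giving   $p_{B_1B_T} = \sqrt{r_{B_1B_2}}$

$r_{A_1B_1} = p_{A_1A_T}\; r_{A_TB_T}\; p_{B_1B_T}$   again giving

$r_{A_TB_T} = \dfrac{r_{A_1B_1}}{\sqrt{r_{A_1A_2}\; r_{B_1B_2}}}$

Fig. 5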


This formula is, of course, Spearman's correction for atten- 
uation. The purpose here is to bring out the simple way in which 
such formulae can be obtained by the method of path coefficients. 
The following is a more complex case in which a simple method 
is more essential. 
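
A small numerical sketch may make the two results concrete. It assumes
that equation (10) has the usual form for the correlation between means of
$m$ and $n$ fallible measurements and that equation (11) is Spearman's
correction; the function names and the sample correlations are
hypothetical.

```python
from math import sqrt

def correlation_of_means(r_ab, r_aa, r_bb, m, n):
    """Correlation between the mean of m measures of A and of n measures of B."""
    return r_ab * sqrt(m * n) / sqrt((1 + (m - 1) * r_aa) * (1 + (n - 1) * r_bb))

def correct_for_attenuation(r_ab, r_aa, r_bb):
    """Limit of the above as m and n increase without bound."""
    return r_ab / sqrt(r_aa * r_bb)

# Hypothetical observed correlations between single measurements.
r_ab, r_aa, r_bb = 0.42, 0.49, 0.64

true_r = correct_for_attenuation(r_ab, r_aa, r_bb)
for m in (1, 10, 1000):
    print(m, round(correlation_of_means(r_ab, r_aa, r_bb, m, m), 4))
print(round(true_r, 4))
```

With a single measurement of each variate the first function returns the
observed correlation itself; as the number of measurements grows it
approaches the corrected value.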

The Statistical Effects of Inbreeding²

Assume for simplicity that the effects of different genetic
factors combine additively (no dominance or epistasis). In Fig. 6,
$P$ and $P'$ represent the genetic constitution of two parents and
$O$ of their offspring. The constitution of the latter (under the
above assumption and ignoring the possibility of sex linkage) is
equally and completely determined by the constitutions of the two
germ cells ($G$, $G'$) which united to produce it. It will be
convenient to represent the path coefficients and correlations by
single letters:

$a = p_{OG} = p_{OG'}$,   $b = p_{GP} = p_{G'P'}$,   $F = r_{GG'}$,   $M = r_{PP'}$

Fig. 6


The determination of $O$ by $G$ and $G'$ can be expressed in
the equation $2a^2(1 + F) = 1$ (by equation 7) giving

(12) $a = \sqrt{\dfrac{1}{2(1+F)}}$

² The purpose in presenting this and later examples is to illustrate
something of the range of applicability of the method rather than to
give a full analysis of each case. For the latter the reader must be
referred to the references cited at the end.

Fig. 7

Two complementary germ cells, such as could arise from the same
reduction division, have the same relation to the genetic
constitution of the parent (which they completely determine in a
mathematical sense) as the two germ cells which united to produce
the parent, assuming no selection and that different series of
allelomorphs are combined at random (an assumption compatible with
linkage among the genes and with inbreeding, but not with
assortative mating). Using primes for the path coefficients and
correlations of the preceding generations,


\f- 


!+ F' 


(13) ^ = c,' 

Since (by equation 12, applied to the preceding 

generation) ^ irrespective of correlation between the 

parents under the assumed conditions. 

( 14 ) 

The correlation between uniting gametes is directly related
to the percentage of heterozygosis. Below is the correlation table
between uniting gametes in a population in which genes $A$ and
$a$ are present in the frequencies of $q$ and $1-q$ respectively and
the proportion of heterozygotes is $p$. By the usual formula for
correlation

(15) $F = 1 - \dfrac{p}{2q(1-q)}$

[Correlation table between uniting gametes]

All of the path coefficients and correlations have now been
expressed in terms of $F$. Various applications can be made. As a
simple case consider the effects of continued brother-sister
mating. Analyzing the correlation ($M$), Fig. 8, between the
parents by tracing the connecting paths:

(16) $M = 2a'^2b'^2(1 + M')$

Expressing all coefficients in terms of $F$'s and reducing

(17) $F = \tfrac{1}{4}(1 + 2F' + F'')$

(18) $p = \tfrac{1}{2}p' + \tfrac{1}{4}p''$

Fig. 8


Thus the percentage of heterozygosis (to which the effects
of inbreeding are directly related) is very simply related to the
percentages in the two preceding generations. If there were
initially 50%, i.e. 1/2, heterozygosis, that of later generations
would be given by the terms of a series of fractions in which each
numerator is the sum of the two preceding numerators (Fibonacci
series) if the denominator is doubled in each generation.
This rule was derived empirically by Jennings (1916) on working
out in detail the consequences of every possible mating,
generation after generation. The analysis by path coefficients
(Wright 1921b) not only demonstrates the generality of the
empirical rule but can be applied as easily to more complicated
cases in which the analysis by types of mating would be practically
impossible. Consider, for example, the more general case of a
population restricted to $N_m$ mature males and $N_f$ mature females
(Wright 1931b). Under random mating, the chance of a mating
of full brother and sister is $1/(N_m N_f)$, of half brother and
sister $(N_m + N_f - 2)/(N_m N_f)$, and of less closely related
individuals $(N_m - 1)(N_f - 1)/(N_m N_f)$.




The correlation between mating individuals is thus 


A/rrj* ^ 


(19) vV'T 


Z-hZ/i, 






(- 






which yields on reduction




r Afyn * 




Equating $p/p'$ to $p'/p''$ gives $\dfrac{1}{8N_m} + \dfrac{1}{8N_f}$
as the approximate rate of reduction of heterozygosis per
generation. The special case in which the population is equally
divided between males and females ($N_m = N_f = N/2$) gives
$\dfrac{1}{2N}$ as the rate of reduction, a figure recently
verified by R. A. Fisher by a very different mode of analysis.
The method has also been applied in the much more complicated
case of assortative mating based on somatic resemblance
(Wright 1921b).
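
The two numerical statements above lend themselves to a short check. The
sketch below assumes the recurrence $p = \tfrac{1}{2}p' + \tfrac{1}{4}p''$
for brother-sister mating, with the first two generations both at one-half,
and the approximate rate $1/(8N_m) + 1/(8N_f)$ for the restricted
population; the function names are hypothetical.

```python
from fractions import Fraction

def sib_mating_heterozygosis(generations, p0=Fraction(1, 2)):
    """Heterozygosis under continued brother-sister mating: p = p'/2 + p''/4."""
    series = [p0, p0]                       # assumed starting values
    while len(series) < generations:
        series.append(series[-1] / 2 + series[-2] / 4)
    return series[:generations]

def reduction_rate(n_males, n_females):
    """Approximate loss of heterozygosis per generation under random mating."""
    return Fraction(1, 8 * n_males) + Fraction(1, 8 * n_females)

series = sib_mating_heterozygosis(6)
# Put generation k over the denominator 2**(k + 1) to expose the numerators.
numerators = [p * 2 ** (k + 1) for k, p in enumerate(series)]
print(numerators)
print(reduction_rate(50, 50))
```

The numerators follow the Fibonacci series while the denominator doubles
each generation, and with equal numbers of the two sexes the rate reduces
to $1/(2N)$.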

In the case of the irregular inbreeding encountered in
livestock pedigrees (Wright 1922, 1923a), the basic formula of path
coefficients leads immediately to the formula

(22) $F = \sum \left(\tfrac{1}{2}\right)^{n+n'+1}(1 + F_A)$

where $n$ and $n'$ are the number of generations from sire and
dam respectively to the common ancestor ($A$) at the head of
each connecting path. By appropriate sampling methods (Wright
& McPhee, 1925) this formula can be used in the study of whole
breeds. Closely allied is the formula for the genetic correlation
between any two individuals ($X$, $Y$). Letting $n$ and $n'$ be the
generations from $X$ and $Y$ respectively to the common ancestor
of any connecting path

(23) $r_{XY} = \dfrac{\sum \left(\tfrac{1}{2}\right)^{n+n'}(1 + F_A)}{\sqrt{(1+F_X)(1+F_Y)}}$

These formulae have been extensively applied in breed
analysis (Wright 1923b-c, McPhee & Wright 1925, 1926, Smith 1926,
Calder 1927, Lush 1932).
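
As a hedged illustration of how such pedigree formulae are applied, the
sketch below assumes that formulae (22) and (23) have their usual form,
namely a sum over connecting paths of $(1/2)^{n+n'+1}(1+F_A)$ for the
inbreeding coefficient and of $(1/2)^{n+n'}(1+F_A)$, divided by
$\sqrt{(1+F_X)(1+F_Y)}$, for the correlation. The function names and the
path lists are hypothetical.

```python
def inbreeding_coefficient(paths):
    """paths: (generations via sire, generations via dam, F of common ancestor)."""
    return sum(0.5 ** (n_sire + n_dam + 1) * (1.0 + f_ancestor)
               for n_sire, n_dam, f_ancestor in paths)

def relationship_coefficient(paths, f_x=0.0, f_y=0.0):
    """paths: (generations from X, generations from Y, F of common ancestor)."""
    total = sum(0.5 ** (n_x + n_y) * (1.0 + f_ancestor)
                for n_x, n_y, f_ancestor in paths)
    return total / ((1.0 + f_x) * (1.0 + f_y)) ** 0.5

# Offspring of a full brother-sister mating: two common ancestors (the
# grandsire and the granddam), each one generation behind sire and dam.
full_sib_offspring = [(1, 1, 0.0), (1, 1, 0.0)]
# Offspring of a half brother-sister mating: one common ancestor.
half_sib_offspring = [(1, 1, 0.0)]

print(inbreeding_coefficient(full_sib_offspring))
print(inbreeding_coefficient(half_sib_offspring))
print(relationship_coefficient(full_sib_offspring))
```

Each connecting path is entered once; an inbred common ancestor raises the
contribution of its path through the factor $(1 + F_A)$.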


Multiple Regression 

The preceding applications have consisted in the main in the
deduction of correlation coefficients from knowledge of the
functional relations. The method can be applied as well to the
inverse problem, that of finding the best linear expression for one
variable in terms of a number of others, from knowledge of the
correlation coefficients. No assumptions are made with respect to
causal relations. Analysis of the correlations between $V_0$ and
the other variables (Fig. 9), by the basic formula, gives the
following set of equations.

$r_{01} = p_{01} + p_{02}r_{12} + \dots + p_{0n}r_{1n}$

$r_{02} = p_{01}r_{12} + p_{02} + \dots + p_{0n}r_{2n}$

. . . . . . . . . . . . . . . . . . . .

$r_{0n} = p_{01}r_{1n} + p_{02}r_{2n} + \dots + p_{0n}$


Obviously these are merely the normal equations of the method
of least squares in a slightly disguised form, as might be
expected from the derivation of the basic formula. The solution
for the path coefficients, expressed in terms of determinants,
merely needs to be multiplied by the proper ratio of standard
deviations to give Pearson's formulae for the partial regression
coefficients. The method of path coefficients here merely furnishes
a convenient mnemonic rule for writing the normal equations.

The correlation between the actual values of $V_0$ and the
estimates of it (Fig. 10) from each set of values of the other
variables (given by the regression equation) is Pearson's
coefficient of multiple correlation. Let $V_u$ stand for the array of
residual factors of $V_0$ in a form independent of the known factors.
We may write an equation of complete determination (Fig. 9)

$\sum_{i=1}^{m} p_{0i} r_{0i} + p_{0u}^2 = 1$

(Fig. 10)

Therefore

$R_{0(12 \ldots m)}^2 = \sum_{i=1}^{m} p_{0i} r_{0i} = 1 - p_{0u}^2$

It is unnecessary to give illustrations of the use of the method
in obtaining ordinary estimation or prediction equations.
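Although the paper gives no worked case, the mechanics are short enough to sketch. The example below assumes the normal equations in the form $r_{0i} = \sum_j p_{0j} r_{ij}$ and the squared multiple correlation as $\sum_i p_{0i} r_{0i}$. The correlations are hypothetical round numbers and the helper names are illustrative only.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def path_coefficients(r_between, r_with_dependent):
    """Path coefficients p_0i from the normal equations r_0i = sum_j p_0j r_ij."""
    return solve_linear(r_between, r_with_dependent)

def multiple_correlation_squared(paths, r_with_dependent):
    """R^2 = sum_i p_0i r_0i, the determination by the known factors."""
    return sum(p * r for p, r in zip(paths, r_with_dependent))

# Hypothetical correlations: two predictors correlated 0.5 with each other,
# and 0.6 and 0.5 with the dependent variable.
r_between = [[1.0, 0.5],
             [0.5, 1.0]]
r_with_dependent = [0.6, 0.5]

p = path_coefficients(r_between, r_with_dependent)
r_squared = multiple_correlation_squared(p, r_with_dependent)
residual_path_squared = 1.0 - r_squared  # p_0u^2 under complete determination
```

Multiplying each `p[i]` by the ratio of the standard deviation of the dependent variable to that of predictor `i` would give the concrete partial regression coefficients.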





A somewhat different type of application has been made in 
estimating the transmitting capacity of dairy sires (Wright 1932a). 
In this case the necessary correlations were deduced from Men- 
delian theory checked by observed correlations between the sire's 
female relatives and his daughters. These correlations were then 
used to calculate the multiple regression of sire on daughters and 
their dams. 

Partial Correlation 

It is sometimes of interest to find the values which statistics
would take, on the average, in data selected for constancy of one
or more variables.

(26) $\sigma_{0 \cdot 12 \ldots m} = \sigma_0 \sqrt{1 - R_{0(12 \ldots m)}^2}$

a well known formula. Inspection of equations (3) and (4) and
of the definition of [symbol illegible in source] gives the following
for the standard deviation of $V_0$ due directly to $V_1$, under
constancy of $V_2$ etc., and for the related path coefficient and
concrete partial regression coefficient under the same conditions.


(27), (28), (29) [equations illegible in source]

As might be expected, the concrete coefficients of the multiple
regression equation ($c_{01}$ etc.) remain the same (on the average)
in samples selected for constancy of one or more of the factors,
while the abstract path coefficients are altered in value in such
samples.

The formula for partial correlation can be derived from the
formula [formula illegible in source] as applied to the data in
which particular variables are constant.



(30) [equation illegible in source]


This derivation leaves the sign uncertain but this is easily
determined from a different approach. In Fig. 11, one residual
variable includes all factors of $V_0$ other than the designated
independent variable $V_1$ and the variables $V_2 \ldots V_m$ which are
to be held constant. A second represents the residual factor for
$V_1$, in relation to the factors $V_2 \ldots V_m$. The value of the path
coefficient from $V_1$ to $V_0$ in this system is not of course the same
as in the preceding discussion in which other variables than
$V_2 \ldots V_m$ (those to be made constant) were treated as factors
of $V_0$.

(Fig. 11)



Since [relation illegible in source]. But [relation illegible in source]

(31) [equation illegible in source]

Thus [symbol illegible in source] has the same sign as [symbol
illegible in source]. Therefore [equation illegible in source].

This is simply formula 28 except that it is in a set up in which all
factors of $V_0$ except $V_1$ are constant, in which case [symbol
illegible in source] becomes [symbol illegible in source].

Since [relation illegible in source], letting [symbol illegible in
source] represent the combination of the residual factors above,
formulae for partial correlation can be written in a number of
very compact forms.

(32), (33), (34) [equations illegible in source]

The first of these is identical with 30.
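The sign argument can be illustrated numerically for the simplest case of one variable held constant. The sketch below uses the standard first-order partial correlation formula, which is an assumption here and not one of the paper's own compact forms (those are not legible in this copy). It shows that the partial correlation equals the path coefficient times a positive factor, so the two necessarily share a sign. The correlations are hypothetical.

```python
from math import sqrt

def partial_correlation(r01, r02, r12):
    """First-order partial correlation of variables 0 and 1 with 2 held constant."""
    return (r01 - r02 * r12) / sqrt((1.0 - r02 ** 2) * (1.0 - r12 ** 2))

def path_coefficient_01(r01, r02, r12):
    """Path coefficient from 1 to 0 when 0 is regressed on 1 and 2."""
    return (r01 - r02 * r12) / (1.0 - r12 ** 2)

# Hypothetical correlations among three variables.
r01, r02, r12 = 0.6, 0.3, 0.5

r_partial = partial_correlation(r01, r02, r12)
p01 = path_coefficient_01(r01, r02, r12)

# The ratio of the two is a ratio of residual standard deviations,
# which is always positive.
positive_factor = sqrt((1.0 - r12 ** 2) / (1.0 - r02 ** 2))
```

Because `positive_factor` is a ratio of two residual standard deviations, the ambiguity left by taking a square root in the derivation is resolved by the sign of the path coefficient.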

Symbolism 

The most widely current symbol for a partial regression
coefficient is Yule's expression $b_{12.34 \ldots n}$. Kelley (1923) uses a
similar expression, $\beta_{12.34 \ldots n}$, for the coefficients in abstract form.
These have an advantage over the symbols used here ($c$ and $p$
respectively) in that they define certain absolute functions of the
variables, while the latter symbols have meaning only in relation
to a particular arrangement. This relativity of meaning cannot,
however, cause confusion as long as one is dealing with only a
single system. If the problem is of a more complex sort than the
calculation of a prediction formula, the $\beta$ symbolism becomes
too cumbersome for convenience. The current symbolism has the
further disadvantage of a certain lack of logical consistency. In
the expressions for the standard deviation and the correlation
coefficient, subscripts to the right of the dot are understood to
represent factors held constant. In the expressions for the multiple
correlation and the regression coefficients they are not. If we wish
to represent the multiple correlation of $X_0$ with $X_1$ and $X_2$
independent of $X_3$, or the beta (path coefficient) for the influence
of $X_1$ on $X_0$ in data involving also $X_2$ and $X_3$ but in
which $X_3$ is held constant, it would apparently be necessary
under the usual symbolism to write such ambiguous expressions
as [expressions illegible in source] respectively. Pearson's method
of writing constant factors as subscripts to the left of the main
symbol avoids these difficulties and is the one which I have
followed in earlier papers. The dot symbolism has, however,
become so firmly established in the cases of the standard deviation
and correlation coefficient that it is probably best to recognize it
as the general device for indicating constant factors and to replace
it in those symbols in which it is used for a different purpose.

There is no difficulty in the case of multiple correlation. The
expression $R_{0(12)}$ may be used for the correlation of $X_0$ with
$X_1$ and $X_2$ jointly and the expression $R_{0(12).3}$ is an unambiguous
symbol for the multiple correlation independent of $X_3$.

As noted above, it is not desirable in the usual application of
path coefficients to encumber the symbols with a list of the factors
of which each dependent variable is treated as a function. This
can be left to a diagram. Where a complete formal symbolism
is desirable, the list of factors might follow a semicolon instead
of a dot. Thus [symbol illegible in source] would unambiguously
represent the path coefficient relating $X_0$ to $X_1$ in a system in
which $X_0$ is treated as a function of $X_1$, $X_2$ and $X_3$ but in
which $X_3$ is to be held constant. There is, however, little need
for such complicated expressions.

Quantitative Evaluation of Causal Relations

While the method of path coefficients is directly applicable to
such problems as the estimation of correlation coefficients from
knowledge of the mathematical relations between variables, or the
converse (multiple regression), it was developed primarily as a
means of combining the quantitative information given by a system
of correlation coefficients with such information as may be at
hand with regard to the causal relations, and thus of making
quantitative an interpretation which would otherwise be merely
qualitative.

How far such causal analysis has meaning is a question on
which there is difference of opinion. Some authors (Pearson,
Niles) have contended that the designation of the relation between
two variables as one of cause and effect involves a false
conception; that we can merely observe more or less perfect
correlation. This view seems to imply that direction in time is
of no significance, and indeed G. N. Lewis has recently argued
for the complete symmetry of the physicist's time. The common
sense view that direction in time is a basic perception is not
without support, however.

Under the theory of relativity, the elementary physical reality 
seems to be the point event located at a particular position in the 
space and time of a particular viewpoint. The objective world 
is to be thought of as a complex network of point events. Al- 
though two such events sufficiently remote from each other in 
space, relative to their separation in time, may have their order 
of succession in time reversed in the systems of two different
observers, order in time is invariant along any strand of this 
network involving continuity of physical action. Thus the succes- 
sion of collisions suffered by a particular body or by a beam of 
light is the same to all observers. Such successions of events as 
involved in the movement of a shadow over a surface may indeed 
be reversed by change of viewpoint, if the shadow happens to be 
moving more rapidly than the velocity of light, but the continuity 
of physical action here is not along the path of the shadow but 
traces separately to each point in this path from the points of 
interception of the light. There is frequently difficulty in com- 
plex cases in distinguishing lines of direct causation from correla- 
tions due to common causation but in principle the distinction is 
clear enough. Experimental intervention is possible only in the 
true lines of causation. 

In the world of large scale events, certain patterns tend to
recur. Certain recurrent successions of events come to be
recognized, experimentally or otherwise, as lines of causation in the
above sense. Different lines of this character may come together
in a certain type of event or may diverge from one. In many
cases a fairly adequate representation of the course of nature can
be obtained by viewing it as a coarse network in which the
"events" of interest are the deviations in the values of certain
measurable quantities. A qualitative scheme depends on
observation of sequences and experimental intervention. It is of interest
to make such a scheme at least roughly quantitative in the sense
of evaluating the relative importance of action along different
paths. This was the primary purpose of the method of path
coefficients.

Birth Weight of Guinea Pigs 

The simplest application of this sort has been in connection 
with the factors which determine the weight of guinea pigs at 
birth (Wright 1921a). Minot (1891) noted that the average 
birth weight is smaller, the greater the size of the litter. He 
reasoned that this might be due either to a competition between
the developing foetuses, or merely to an effect of a large litter 
in stimulating somewhat premature birth. In confirmation of the 
latter hypothesis he found that the gestation period was several 
days shorter in large litters than in small ones and that there was 
in fact a direct relation between length of gestation period and 
birth weight. After some discussion, he concluded that the data 
afforded no evidence of growth competition and thus he decided 
in favor of the second hypothesis. I was able to confirm Minot's
observations, obtaining the following data in a large stock of
guinea pigs. The mean birth weight (in grams) of the animals
in the litter is the birth weight used. The interval between litters,
where less than 75 days, is approximately the gestation period.
Standard errors are given.



                      Mean            S.D.            Correlation

B (Birth weight)      85.2? ± 0.51    18.?0 ± 0.36    $r_{BI}$ = +.553 ± .020
I (Interval)          68.?3 ± 0.05     1.?? ± 0.04    $r_{BL}$ = -.658 ± .010 *
L (Size of litter)     2.9? ± 0.0?     ?.03           $r_{IL}$ = -.437 ± .022

(Digits shown as ? are illegible in the source.)

* The correlation between birth weight and size of litter was
based on 3353 cases, the other two correlations on 1317 cases.


In order to make a comparison of Minot's two alternatives,
these may be represented graphically in a single diagram.

Birth weight ($B$) is completely determined (in the mathematical
sense rather than causally) by the prenatal growth curve and
the age at which growth is interrupted by birth ($G$). It is
assumed that the rate of growth ($R$) immediately before birth is a
sufficient index of the growth function and that the rate of growth
is uniform at this time to a sufficient degree of approximation. In
substituting gestation period for interval a small correction is
desirable. On grounds which need not be gone into here,
it is estimated that the correlation between interval and true
gestation period is about .95. No correction is necessary for birth
weight since there is little or no growth in the first day after
birth. The correlations involving interval must be divided by .95
to obtain estimates of those involving gestation period:
$r_{BG} = r_{BI}/.95$ and $r_{GL} = r_{IL}/.95$, while $r_{BL}$ is unchanged.

Minot's problem resolves mathematically into the analysis of the
observed correlation between birth weight and size of litter into
the sum of two composite path coefficients representing the two
postulated paths of influence.

The method furnishes at once four equations for determining
the values of the four path coefficients. One of these expresses
the complete determination of $B$ by $R$ and $G$. The others are
the expressions for the three known correlations.


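(35) $p_{BR}^2 + p_{BG}^2 + 2\, p_{BR}\, p_{BG}\, p_{RL}\, p_{GL} = 1$

(36) $p_{GL} = r_{GL}$

(37) $p_{BG} + p_{BR}\, p_{RL}\, p_{GL} = r_{BG}$

(38) $p_{BR}\, p_{RL} + p_{BG}\, p_{GL} = r_{BL}$

Fig. 12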


These are not all linear equations, a condition which generally
distinguishes this sort of application of the method from the
calculation of partial regression coefficients. In the present case,
however, there is no difficulty in the solution.

$p_{BG}\, p_{GL} = -.15$, $p_{BR}\, p_{RL} = -.51$, $r_{BL} = -.66$

The result is an analysis of the correlation between birth
weight and size of litter into two components whose magnitudes
indicate that size of litter has more than three times as much
linear effect on birth weight through the mediation of its effect
on growth as through its effect on the length of the gestation
period, contrary to the results of Minot's verbal analysis.
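The solution can be traced step by step. The sketch below assumes the four equations take the form implied by the diagram: $p_{GL} = r_{GL}$, $r_{BG} = p_{BG} + p_{BR} p_{RL} p_{GL}$, $r_{BL} = p_{BR} p_{RL} + p_{BG} p_{GL}$, and complete determination of $B$ by $R$ and $G$. The three input correlations are illustrative round numbers chosen to be close to the corrected values, not the figures of the table, and the function name is hypothetical.

```python
from math import sqrt

def litter_size_paths(r_bg, r_gl, r_bl):
    """Split r_BL into a gestation component and a growth component."""
    p_gl = r_gl                                      # L is the only traced cause of G
    p_bg = (r_bg - r_bl * r_gl) / (1.0 - r_gl ** 2)  # from the r_BG and r_BL equations
    via_gestation = p_bg * p_gl                      # compound path L -> G -> B
    via_growth = r_bl - via_gestation                # compound path L -> R -> B
    # Complete determination of B by R and G fixes p_BR, and hence p_RL.
    p_br = sqrt(1.0 - p_bg ** 2 - 2.0 * p_bg * p_gl * via_growth)
    p_rl = via_growth / p_br
    return {"p_GL": p_gl, "p_BG": p_bg, "p_BR": p_br, "p_RL": p_rl,
            "via_gestation": via_gestation, "via_growth": via_growth}

# Illustrative inputs (assumed, not taken from the table).
paths = litter_size_paths(r_bg=0.56, r_gl=-0.46, r_bl=-0.66)
ratio = paths["via_growth"] / paths["via_gestation"]
```

With these inputs the growth component comes out near -.51 and the gestation component near -.15, a ratio of somewhat more than three to one.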

In this case, the answer to Minot's question might have been
obtained from a set up mathematically identical with that used
in multiple regression (after correcting the correlations with
interval to obtain estimates of those with true gestation period).

By equation 24,

(39) $r_{BL} = p_{BL} + p_{BG}\, r_{GL}$

(40) $r_{BG} = p_{BL}\, r_{GL} + p_{BG}$

Fig. 13

The term $p_{BL} = -.51$ can be interpreted as
measuring the influence of size of litter on birth
weight in all other ways than through gestation period. In other
cases, however, proper causal analysis may require a set up utterly
different from that used in obtaining the best estimation equation.
There is no routine method of making the proper diagram in the
former case. This seems to have occasioned more misunderstanding
than anything else among those who have attempted to apply
the method. One author, in a critique of the method, took the
form of diagram intended to represent the sequential relations in
the case of guinea pig weight and arranged some variables relating
to basal metabolism in man in the same scheme in an arbitrary
way and then complained of the meaningless and absurd results
which he obtained!

Transpiration of Plants 

The contrast between the kind of set up appropriate to an
estimation equation and that for evaluation of a causal
interpretation was illustrated early (1921a) in connection with a study of
the data of Briggs and Shantz on transpiration in plants. The
reader is referred to the paper for the details, but it may be
appropriate here to compare the different diagrams used. The
authors obtained the total daily transpiration of a number of
plants. The environmental factors studied were total solar
radiation ($R$), wind velocity ($W$), air temperature in the shade ($T$), rate
of evaporation from a shallow tank ($E$), and wet bulb depression,
sheltered from sun but not wind ($B$). To avoid seasonal effects,
the logarithms of ratios for successive days were used instead
of absolute values.

An estimation equation for wet bulb depression was obtained
in terms of wind velocity, solar radiation and temperature (Fig.
14).

Fig. 14

It was pointed out that for causal analysis, radiation should be
omitted as not affecting wet bulb depression in the shade, while a
factor not directly measured, absolute humidity ($H$), should be
included. There should be complete determination
of $B$ by $W$, $T$ and $H$. As so arranged, there are two more
unknown coefficients than known ones. It was assumed that there
was no correlation between absolute humidity and wind velocity.
The necessary additional equation was obtained from the
theoretical multiple regression equation relating $B$ to $W$, $T$ and $H$,
by substituting the extreme differences in wet bulb depression,
temperature and wind velocity of the average daily cycle and
assuming the absence of any such cycle in absolute humidity.
Possibly this was not wholly justified in this case. If so, no
numerical evaluation of the chosen point of view could be made.
Even in such cases, the attempt at analysis by path coefficients
may be valuable in locating deficiencies in the data already
collected and suggesting the kinds of new data which should be
obtained.

The final set up used in relating transpiration and evaporation
from a tank to wet bulb depression and the chosen
environmental factors is given in figure 15 with the values of the
path coefficients and correlations. Determinations were made for
10 varieties of plants. These gave fairly consistent results which
are averaged in Fig. 15 although there were certain interesting
differences. There was a marked difference between the
transpiration of the plants and the rate of evaporation from the tank
in the relative importance of the various factors.

Fig. 15

The Relative Importance of Heredity and Environment

Among the most satisfactory applications to causal relations
are to problems of genetic determination. The development of
an organism is the product of the confluence and interaction of
two distinct streams of causation, heredity and environment. The
interaction between the hereditary influences emanating from the
nuclei of the cells of the organism and the influences coming
from outside these cells, but largely from other parts of the body,
where they in turn are the products of heredity and cell
environment and so on back to the one cell stage, is complex enough,
but if we go back of this to the ultimate factors, the array of
genes assembled at fertilization and the environmental conditions
external to the organism, the sequential relations are for the most
part clear. The problem is that of determining the relative
importance of differences in heredity and of differences in
environment in determining differences in the characteristics of
individuals in a given population. The principal complications are the
possibilities of nonlinearity in the combination effects of different
genes, of different environmental factors, and of heredity and
environmental factors in relation to each other.

We will review a case, the amount of white in the coat
pattern of certain strains of guinea pigs, in which such combination
effects appear to have been of negligible importance (Wright
1920, 1926b). A stock of guinea pigs was maintained for many
years by the U. S. Bureau of Animal Industry without outcross,
but with the avoidance of even second cousin mating. The
correlation between mated individuals (143 pairs) was [value illegible
in source], indicating that mating actually was at random in respect
to coat pattern. The correlation between parent and offspring averaged
+.19 [third digit and standard error illegible in source],¹ with no
significant differences in relation to sex.

By the theory developed on page [number illegible in source],
$r_{OP} = \tfrac{1}{2}$. Allowing for incomplete determination by heredity
this becomes

(41) $r_{OP} = \tfrac{1}{2} h^2$

Thus $h^2 = .38$, leaving .62 for determination by environment.
The correlation between litter mates averaged [value illegible in source].

¹ These standard errors, obtained from values in different subdivisions
of the data, are larger than would be obtained from the 3881 parent-offspring
pairs, which however necessarily involve much repetition of individuals.

In the case of litter mates it is necessary to distinguish two
groups of environmental factors: ones common to litter mates ($E$)
and ones peculiar to individuals ($D$). From the diagram (Fig. 16)

(42) $r_{OO} = \tfrac{1}{2} h^2 + e^2$

where $e^2$ is the degree of determination by common environment.
Its value is .09, leaving $d^2 = .53$ as the determination by
nongenetic factors not common to litter mates. It seems rather
surprising that the environment common to litter mates should
determine so little in a character graded at birth, but only very
minor effects of this sort have been discovered experimentally, the
most important (contributing .036 to the correlation of litter mates)
being an effect of the age of the mother. The high degree of
asymmetry of the pattern in individual animals is in harmony with
a large element of chance (somatic mutation?) in the determination
of pigmented areas.
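The arithmetic of this partition is brief. The sketch below assumes the relations $r_{OP} = \tfrac{1}{2}h^2$ for parent and offspring and $r_{OO} = \tfrac{1}{2}h^2 + e^2$ for litter mates, with $d^2$ as the remainder. The litter-mate correlation of .28 is not legible in this copy and is an assumed value, chosen because it is the one implied by the stated results $h^2 = .38$ and $e^2 = .09$. The function name is illustrative.

```python
def partition_variance(r_parent_offspring, r_litter_mates):
    """Degrees of determination by heredity, common and individual environment."""
    h2 = 2.0 * r_parent_offspring    # heredity, from r_OP = h^2 / 2
    e2 = r_litter_mates - 0.5 * h2   # environment common to litter mates
    d2 = 1.0 - h2 - e2               # nongenetic factors peculiar to individuals
    return h2, e2, d2

# 0.19 is the parent-offspring correlation implied by h^2 = .38;
# 0.28 for litter mates is an assumption (see the lead-in).
h2, e2, d2 = partition_variance(0.19, 0.28)
```

The three portions sum to unity by construction, so any error in the assumed litter-mate correlation shifts variance between $e^2$ and $d^2$ without affecting $h^2$.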

The above estimates ($h^2 = .38$, $d^2 = .53$, $e^2 = .09$) are estimates
of the portion of the variance due to heredity, non-genetic factors
peculiar to individuals, and common environment, respectively.
They are the portion of the variance which should be eliminated
by control of each factor. It is not possible to control the rather
intangible environmental factors but hereditary variation can be
eliminated by close inbreeding (decrease of heterozygosis being
about 19% per generation under brother-sister mating). It
happened that a number of piebald stocks were on hand, each
descended from a single mating after several generations of
inbreeding. These differed markedly in average percentage of white
in the coat, although individuals of each varied widely about their
family averages. Crosses between strains at opposite extremes
gave intermediate offspring, justifying the assumption of no
dominance. The family (No. 35) most advanced in inbreeding
was descended from a single mating in the 12th generation of
brother-sister mating, but even in it there was variation from
nearly solid color to solid white. As expected by theory, very
little, if any, of this variability was hereditary. The correlation
between parent and offspring was only [value illegible in source]
± .020. The correlation between litter mates was .103 ± .025, again
indicating only a small amount of influence of environment common
to litter mates.

The standard deviation, measured on an appropriate scale,*
came out .574 (about 22% of the area of coat in the neighborhood
of 50%). In the random bred stock the standard deviation
was 0.782 (about 28%). The variance of the stock in which
hereditary variation had been eliminated was thus 54%
[$= (.574/.782)^2$] of that of the random bred stock. This agrees as
well as could be expected with the estimate of 62% of the variance
of the latter as nongenetic, based on the parent-offspring
correlation, although not as well as an earlier estimate made when the
numbers were smaller (nongenetic variance 58% as deduced for a
parent-offspring correlation of .21 in random stock, variance of
inbred family 57% of that of random stock).

* On a percentage scale of measurement, necessarily limited at 0% and
100%, a given factor has more effect near the middle of the range than
near the limits. The appropriate transformation of the scale $X$, ranging
from 0 to 1, is $\mathrm{prf}^{-1}(X - .50)$, where $\mathrm{prf}^{-1}$ is the inverse probability
function, the direct function being defined in the form [integral
illegible in source] (Wright [year illegible in source]).
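The comparison of the two estimates of nongenetic variance is a matter of a few lines of arithmetic. The standard deviations .574 and .782 are those quoted above; the parent-offspring correlation of .19 is the value implied by the stated $h^2 = .38$. Variable names are illustrative.

```python
inbred_sd = 0.574   # family in which hereditary variation had been eliminated
random_sd = 0.782   # random bred stock

# Direct estimate: ratio of the two variances.
variance_ratio = (inbred_sd / random_sd) ** 2

# Indirect estimate: nongenetic share from the parent-offspring correlation,
# using h^2 = 2 r for a random bred stock without dominance.
r_parent_offspring = 0.19
nongenetic_share = 1.0 - 2.0 * r_parent_offspring
```

The two routes give roughly .54 and .62, which is the discrepancy the text describes as agreeing "as well as could be expected".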

Case of Human Intelligence

Another illustration of the difference between a quantitative 
interpretation and a multiple regression formula has been given 
(Wright 1931a) using data of Miss B. S. Burks, on the roles of 
heredity and environment in determining human intelligence. 
These data consisted of intelligence tests of 104 California chil- 
dren, tests of their parents and in addition grades of home en- 
vironment. Similar data were obtained of 206 children adopted 
at an average age of 3 months, and of their foster parents and 
home environments. The correlations as used were corrected by 
Miss Burks for attenuation. 

If the purpose is to obtain the best estimation for children in 
terms of their parents and environments, the variables are to be 
related as in figure 17 in which C is child’s intelligence, P is 
midparent and E is the measure of home environment.

Normal Equations (Fig. 17)                   Children
                                        (Own)      (Adopted)

(43), (44), (45)   [equations and observed correlations illegible in source]

Solution
   $p_{CP}$                             +.72        -.07
   $p_{CE}$                             -.13        +.35

Fig. 17


The solutions of the normal equations in the two bodies of 
data give what at first sight appear to be contradictory results. 
There is no apparent reason why environment should not play as 
great a role in shaping intelligence in one case as in the other, 
yet it turns out that while the partial regression of child's IQ on
home environment is significantly positive in the foster data, it is
negative as far as it goes, in the case of own children. 
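The apparent contradiction is easy to reproduce with a two-predictor regression. In the sketch below the normal equations are assumed to be $r_{CP} = p_{CP} + p_{CE} r_{PE}$ and $r_{CE} = p_{CP} r_{PE} + p_{CE}$. All six correlations are assumptions made for illustration, since the printed values are not legible in this copy; they were chosen so that the solutions round to the path coefficients quoted in the table. The function name is hypothetical.

```python
def two_predictor_paths(r_cp, r_ce, r_pe):
    """Solve the two normal equations for p_CP and p_CE in closed form."""
    det = 1.0 - r_pe ** 2
    p_cp = (r_cp - r_ce * r_pe) / det
    p_ce = (r_ce - r_cp * r_pe) / det
    return p_cp, p_ce

# Assumed correlations (illustrative; not read from the source).
own = two_predictor_paths(r_cp=0.61, r_ce=0.49, r_pe=0.86)
adopted = two_predictor_paths(r_cp=0.23, r_ce=0.29, r_pe=0.86)
```

With a high correlation between midparent and home environment, the determinant `1 - r_pe**2` is small, so modest differences in the observed correlations are enough to reverse the sign of a coefficient.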

The point that is sometimes overlooked is that the arrangement
for obtaining the best possible prediction equation does not
necessarily yield coefficients which have any simple interpretation.
This is obviously the case here. If child's IQ is affected both by
heredity and environment, the same is presumably true of parental
IQ. In so far as the latter is determined by environment it
is not a causal factor in relation to child's heredity. A diagram
intended to represent causal relation must represent parental IQ
as merely correlated (two headed arrows) with child's heredity
and child's environment. Another complication which must be
represented is the correlation of heredity with environment. Good
heredity in a family will tend to create a good environment and
vice versa. The simplest possible interpretative diagram for own
children is thus of the type of figure 18. That for foster children
is given in figure 19.

Even these are doubtless too simple since heredity is represented
as the only factor apart from the measured environment. Any
estimates of the importance of hereditary variation will thus be
maximum.

The two correlations given by Miss Burks in the case of the
foster data (Fig. 19) yield the value [ratio illegible in source]
$= +.77$ for the correlation between home environment and
midparental IQ. The actual correlation was not published for the
foster data, but there is no reason why it should differ
significantly from that in the other data in which the value was
[value illegible in source]. There is reasonable agreement.

In the case of own children, three correlations of interest
here were published [values illegible in source].
But this is not enough to give a solution for the coefficients of
the 5 indicated paths. The assumption of complete determination
of C by H and E gives a fourth equation, still an inadequate
number. No solution is possible, a situation which, as previously
noted, very frequently arises in such analysis, even when one
makes the most simplified possible qualitative representation of
the causal relations. A great deal of utterly unwarranted verbal
interpretation of correlation coefficients would be avoided if the
authors took the trouble to represent their ideas in diagrammatic
form and noted whether or not the number of equations possible
from the data (known correlation coefficients and known cases of
complete determination) was as great as the number of paths in
this diagram.

In the present case, another equation can be obtained by
borrowing from the foster data. Environment should make
approximately the same contribution to IQ in both groups of children.
The concrete partial regression coefficients [symbols illegible in
source] should thus be approximately the same in the foster as in
the own children. Assuming that [symbols illegible in source] are
the same in both cases, the ratio [symbol illegible in source] from
the foster data may be accepted for the group of own children.
The five equations now available are as follows:

Equations and Solution

(46), (47), (48), (49), (50) [equations and solution illegible in source]


The solution assigns reasonable values in all cases and shows
that there was no real disagreement involved in the relation of 
the two groups of children to their environments. 

It was noted that this analysis gives a maximum estimation 
of the role of heredity. An attempt was made to obtain a min- 
imum estimate compatible with acceptance of the observed cor- 
relations, by carrying the analysis back a generation and assuming 
as much similarity in the determining factors of successive gen- 
erations as the data permit. 

Such analysis requires separate treatment of heredity ($H$) as
a factor of development, and heredity or genotype ($G$) as the
linear system of gene effects which best approximates the former.
Departures from linearity in the effects of allelomorphs
(dominance) and in the effects of nonallelomorphs (epistasis) are
common. Moreover there may be non-linearity in the combination
effects of heredity and environment. Thus a certain genetic
complex in the guinea pig ($c^{?}c^{?}BB$; superscripts illegible in
source) produces more melanin pigment at low temperatures than
does a certain other, but less at high temperatures (Wright 1927).
The subject is too involved for detailed discussion here but it may
be noted that in general correlations between deviations due to
dominance and epistasis must be taken account of.

In the case of Miss Burks' data, there is no possible way of
distinguishing the effects of environmental factors not included
in the measurement of home environment from the contributions
of dominance and epistasis or from non-linearity in the
combination effects of heredity and environment. In the attempt at
obtaining a minimum estimate of heredity, these three very diverse
factors were put together in a miscellaneous group $M$. The
diagram of relation used is given in Fig. 20. Child's genotype ($G$)
is represented as partially determined by midparental genotype,
the residual variability being that of Mendelian segregation.
Child's environment is treated as in part determined directly by
midparental intelligence $P$ and in part tracing to the environment
of the preceding generation.

Fig. 20

The path coefficient relating genotype of midparent to that of
child could be estimated, assuming Mendelian heredity and taking
into account a correlation of +.7[second digit illegible in source]
between father and mother. It turned out to be mathematically
impossible to assign the same values to the path coefficients of the
parental generation as in the offspring generation, but this is not
surprising since the parents were tested as adults instead of young
children. The solution for the parent generation was to some
extent indeterminate but within rather narrow limits, on making
what seemed the most reasonable assumptions. The values reached
are given in figure 20. The path coefficient for influence of
hereditary variation lies between the limits +.71 (if dominance and
epistasis are lacking) and +.90.

Analysis of Sise Factors 

The first published application of the method was to the in- 
terpretation of a system of correlations of bone measurements 
(length and breadth of skull, lengths of humerus, femur and 
tibia) in a population of rabbits (Wright 1918). The 10 ob- 
served correlations were accounted for primarily as due to a 
single general factor (not necessarily acting proportionately on
the 5 variates). The residuals which appeared were attributed to 
group factors. 

In a recent paper (1932b) the same figures, two other sets
of figures for rabbit populations ($F_1$ and $F_2$ of a wide cross)
and figures from a flock of hens have been analyzed by a somewhat
improved method. A set of $n$ variables yields $n(n-1)/2$ correlation
coefficients and hence the same number of observation
equations of the type $r_{AB} = p_{AG}\,p_{BG}$, where A and B are two
of the variables and G is the general factor, and it is assumed
for the moment that the correlations are due solely to differences
in general size. The residuals are minimized by the method of
least squares.
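The least-squares step just described can be sketched as follows. This is only an illustration under stated assumptions: the function names, the iterative scheme (each coefficient re-solved in turn with the others held fixed) and the five test coefficients are hypothetical and are not taken from the 1932 paper.

```python
from itertools import combinations
from math import sqrt


def fit_general_factor(r, n_iter=500):
    """Least-squares fit of r[i][j] ~ p[i] * p[j] for every pair i != j.

    r is a symmetric correlation matrix (list of lists); its diagonal is
    ignored. Returns the path coefficients p[i] relating each variable to
    a single general factor.
    """
    n = len(r)
    pairs = list(combinations(range(n), 2))
    # Starting value: square root of the mean absolute correlation.
    mean_abs = sum(abs(r[i][j]) for i, j in pairs) / len(pairs)
    p = [sqrt(mean_abs)] * n
    for _ in range(n_iter):
        for i in range(n):
            # With the other coefficients held fixed, the sum of squared
            # residuals is quadratic in p[i]; this is its minimum.
            num = sum(r[i][j] * p[j] for j in range(n) if j != i)
            den = sum(p[j] ** 2 for j in range(n) if j != i)
            p[i] = num / den
    return p


def residuals(r, p):
    """Observed minus fitted correlation for every pair of variables."""
    return {(i, j): r[i][j] - p[i] * p[j]
            for i, j in combinations(range(len(p)), 2)}


# Hypothetical coefficients for 5 measurements, used to build a test matrix.
true_p = [0.9, 0.8, 0.7, 0.6, 0.5]
r = [[1.0 if i == j else true_p[i] * true_p[j] for j in range(5)]
     for i in range(5)]
p_hat = fit_general_factor(r)
res = residuals(r, p_hat)
print([round(x, 3) for x in p_hat])
print(len(res), "observation equations")
```

With real data the residuals returned by `residuals` would not vanish; the positive ones are the quantities attributed in the text to group factors.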





This method necessarily gives residuals which are as likely to 
be negative as positive. The interpretation is more satisfactory 
if the path coefficients relating each measurement to the general 
factor, are all reduced by the proportion necessary to eliminate 
significant negative residuals. It happened that in each of the 
4 sets of data studied, the most important negative residuals were
those between the skull and hind leg measurements, and the 
method followed was to eliminate the average of these. 

The important positive resid- 
uals in all cases indicated natu- 
ral group factors — a head group, 
a general leg group, a foreleg 
group (in the one case in which 
both humerus and ulna were 
measured) and a hind leg group. 

Other indications, such as a slightly closer relation between
head and foreleg than between head and hind leg, and a slightly closer
relation between proximal leg bones (humerus and femur)
than between non-homologous fore and hind leg bones (humerus and tibia),
were less certain. Figure 21 shows the system of path coefficients
arrived at in the case of the fowl measurements. The squares of these
give the degree of determination in each case by the general factor,
the group factors and special factors.

The Use of Partial Correlation in Interpretation 

Partial correlation coefficients have sometimes been used in
the attempt to interpret systems of correlated variables, apparently
on the theory that the reduction or elimination of a correlation
between two variables on holding a third constant demonstrates
the latter to be causally responsible for the correlation. The






method at first sight seems analogous to that of the experimentalist
in attempting to control all sources of variation except those
in which he is interested. This, however, is a delusion in the case 
of correlation (as opposed to regression) coefficients (Wright 
1921a) and the method of path coefficients was developed because 
of the unsatisfactory nature of interpretation based on partial 
correlation. As R. A. Fisher (1925) has stated, “In no case, 
however, can we judge whether or not it is profitable to eliminate 
a certain variable unless we know or are willing to assume a 
qualitative scheme of causation.” 

This point can be illustrated by considering a system of 3
variables, A, B and C, in which correlations have been found
such that $r_{AC} = r_{AB}\,r_{BC}$.

By substitution in the usual formula, $r_{AC \cdot B} = 0$. This is
compatible with the interpretation, represented in figure 22, that
B is an intermediary in a single chain of causation connecting
C and A.


Another interpretation is that B is the only common factor of A and C.

But it is also possible that B may be the product of the interaction
of two correlated factors A and C.

Fig. 22

Finally A, B and C may be correlated with each other through
reciprocal interactions, or through complexes of unknown common
factors, making impossible anything beyond the mere descriptive use
of the correlation coefficients, or the calculation of estimation equations.
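A small numerical sketch may make the point concrete. The path coefficients below are hypothetical, chosen so that three different causal schemes (chain, common factor, and B as the joint effect of correlated A and C) all reproduce the same three correlations and therefore the same vanishing partial correlation; the function name is likewise an assumption of the example.

```python
from math import sqrt


def partial_corr(r_ac, r_ab, r_bc):
    """Correlation of A and C with B held constant (the usual formula)."""
    return (r_ac - r_ab * r_bc) / sqrt((1 - r_ab ** 2) * (1 - r_bc ** 2))


# 1. Chain C -> B -> A, each path coefficient assumed to be .5.
chain = {"r_ab": 0.5, "r_bc": 0.5, "r_ac": 0.5 * 0.5}

# 2. B the only common factor of A and C, each path coefficient .5.
common = {"r_ab": 0.5, "r_bc": 0.5, "r_ac": 0.5 * 0.5}

# 3. B affected by two correlated factors A and C: paths a and c lead
#    from A and C to B, and A and C are correlated to the extent r_ac.
a, c, r_ac = 0.4, 0.4, 0.25
effect = {"r_ab": a + c * r_ac, "r_bc": c + a * r_ac, "r_ac": r_ac}

schemes = {"chain": chain, "common factor": common, "joint effect": effect}
partials = {name: partial_corr(**corr) for name, corr in schemes.items()}
for name, corr in schemes.items():
    print(name, corr, round(partials[name], 6))
```

The three schemes are indistinguishable from the correlations alone, which is why a qualitative scheme of causation has to be supplied from outside the data.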





The first step in the application of the method of path 
coefficients is to bring clearly into the open the system 
of functional relations among the variables which 
seems significant for purposes of interpretation. In 
the majority of cases, verbal interpretations which 
seem reasonable enough as long as the basic postulates 
are kept discreetly in the subconscious mind become obviously
crude and inadequate when expressed in a diagram. Occasionally,
however, statistical systems are capable of some interpretation. 

Difficulties in Causal Analysis

There are a great many systems of correlated variables for 
which no interpretation can be suggested in terms of sequential 
relations. Among these are cases in which there is prevailingly 
mutual interaction between the variables instead of action in one 
direction. The branches of science differ considerably in the type 
of relation which predominates. 

As already noted the developmental process of organisms is 
essentially a one way process, and the ultimate factors of devel- 
opment, heredity and environment act on it without being acted 
upon. A method of analysis which takes account of the sequential 
relations is thus imperatively called for in genetics.

A case in which such analysis would not be possible may be
illustrated by the relations among the various properties of the
blood, as discussed by L. J. Henderson. The physiological mechanisms
are such that alteration of any one brings about immediate
readjustments in the values of the others. What one wishes to
determine are the functional relations, whether in the form of
equations or of nomograms. If such a system were studied by
correlational methods the best that could be done would be to
attempt to approximate the functional relations by multiple
regression (linear or curvilinear as the case required).

Fig. 25

There is usually rapid reciprocal action among the variables
of interest to the economist or sociologist, and the correlations
among the simultaneous deviations cannot, in most cases, be
treated as due to lines of one way causation among these variables
themselves. Thus the price of a commodity cannot properly
be treated as caused by the amount marketed or vice versa. The 
exception is where one variable is clearly external to the social 
system in question as is the influence of weather on crop yield. 

There is more likelihood of being able to represent the various 
simultaneous deviations as direct consequences of the system of 
deviations of the preceding year (together with the clearly exter- 
nal contemporary factors) but even here, a causal diagram can 
be set up only after a most careful consideration of the realities 
of the case. There may be lags of greater duration than one 
year and a correlation between two variables in successive years 
may trace to more remote common factors rather than to a direct 
line of causation from the earlier to the later. 

Corn and Hog Correlations 

These points were illustrated by a study of corn and hog cor- 
relations (Wright 1924). An attempt was made to analyze the 
play of interacting factors responsible for the annual fluctuations 
from the general trends in production and price of hogs during 
the relatively undisturbed period between the Civil War and the 
World War. It was shown that variation in the corn crop and 
certain interrelations among the hog variables themselves deter- 
mined from 75 to 85% of the variance of the latter. The annual 
fluctuations about the trend during the period of years from 1871 
to 1915 inclusive (so far as data were available) were found for 
corn acreage, yield, crop and price and for western and eastern
wholesale hog packs and for farm price of hogs. The fluctuations
were found separately for the summer and winter seasons for
western wholesale hog pack and the corresponding live weight,
pork production (product of preceding) and price. Correlation
coefficients were found not only for the same year but between
variables separated by one, two and often three years. Altogether
510 correlation coefficients were calculated.





Most of these coefficients could be given reasonable enough 
verbal interpretations, but there was no assurance that the “obvious”
interpretation in one case was compatible with an equally
“obvious” interpretation in another. The problem was to represent
all of these verbal interpretations in a single diagram and
determine path coefficients which would account simultaneously 
for the entire system of correlation coefficients. With 510 cor- 
relation coefficients and 4 cases of complete determination, one 
could write 514 simultaneous equations to determine the values 
of whatever system of path coefficients had been used. Theoret- 
ically one could introduce the same number of different paths 
into the diagram. It would not be practicable, however, to deal 
with such a large number of unknown quantities and even if prac- 
ticable, the complexity of the system would defeat the purpose 
of the analysis. The problem thus resolved into the discovery of 
a simple system of relations which would give a reasonably close 
approximation to all of the correlation coefficients. 

It has been emphasized that the method of path coefficients 
is not intended to accomplish the impossible task of deducing 
causal relations from the values of the correlation coefficients. It 
is intended to combine the quantitative information given by the 
correlations with such qualitative information as may be at
hand on causal relations to give a quantitative interpretation. The 
analysis of cases such as the present and that preceding (size 
factors), in which the equations far outnumber the coefficients 
to be determined, may appear to be exceptions to this statement, 
but even here only such paths are tried which are appropriate in 
direction in time and which can be given a rational interpretation. 

Considerable experimentation was necessary before a simple 
system could be found which gave even moderately satisfactory 
results. The procedure followed was to list the highest five correlations
of each variable with a preceding variable. It turned
out that the corn variables were so nearly independent of conditions
in preceding years that they might be treated practically as
independent in relation to the hog situation. The variations in
corn crop depended largely on variations in yield and
secondarily on variations in acreage. Corn price

showed a correlation of .with the crop. 

Among the hog variables, the maximum correlations were with
those which indicated most directly the amount of breeding (average
summer weight (sw) of the same year, winter pack (wp)
a year and a half later, between which there was a correlation of
i-.7S) r and with the preceding prices of corn and of hogs. The 
four variables: breeding, summer (s) and winter (w) price
of hogs, and price of corn were thus chosen as a central system.
36 equations could be written involving these (using jointly
the two indicators of breeding). Values of 13 path coefficients
were tested by repeated trial and error until it seemed that no
change (of the order of .05) would give improvement. The
system reached is shown in figure 26, in which primes refer to
preceding years.

The other variables were then appended to this system, also
by the trial and error method. Corn crop was used in place of
corn price, however. The results are shown in figures 27 and 28.
These bring out the very different characteristics of the summer
pack (sp) (consisting of a very heterogeneous lot of hogs) and the
winter pack, largely consisting of the spring pig crop. Average
summer and winter live weights are represented by (sw) and
(ww) in figure 27.

The general conclusions were that the dominating features of 
the hog situation are the corn crop and its price, and an innate 
tendency to fall into a cycle of successive overproduction and
underproduction, two years from one extreme to the other, de- 
pending mainly on two compound paths: 


TO 

s'" 


The 32 indicated path coefficients together with 10 others relating
total annual western pack to its components, eastern pack to
western pack and prices, and farm price to packer’s price, accounted
for the 510 observed correlation coefficients with an
average error of only .09 neglecting sign. The most serious dis-
crepancies were in certain correlations involving corn acreage and 
yield which were intentionally ignored for the sake of avoiding 
complexity in the relations of the more important variables. 



The Elasticities of Supply and Demand 
In the preceding illustration, market supplies, prices, etc. were
related to preceding conditions largely by a trial and error process
of finding the system which would work best and without much
regard for theoretical considerations. In the following more
theoretical approach I have collaborated with Dr. P. G. Wright. 
The purpose is to interpret observed series of prices and quanti- 
ties marketed as functions of two hypothetical variables, the conditions
of supply and demand. Only a brief reference has previ-
ously been published (P. G. Wright 1928). 

The demand for a given commodity and given market is 
treated as that function of all economic factors (prices, wages,
etc.) which determines the quantity which would be purchased
under any set of postulated conditions. The supply function, 
similarly, is treated as that function of all economic factors 
(prices, manufacturing costs, weather, etc.) which determines the 
quantity which would be offered for sale under any set of postu- 
lated conditions. The actual values which these functions take at 
a given moment tend to be the same, the price of the commodity 
itself being the immediate factor which shifts to such a value as 
to make them identical. 

We shall deal with the annual percentage deviations in quantities
and prices, whether from the preceding year or from the
estimated trend of a series of years, instead of absolute values. 
The relative merits of these two procedures need not be gone into. 

Let X represent values on a scale of percentage change in
quantity and Y values on a scale of percentage change in the price
of the commodity in question. Let $Z_1$, $Z_2$, etc. represent other
economic factors of demand or supply or both on whatever scales
are most suitable. The demand and supply functions themselves,
as percentage deviations in quantities under postulated conditions,
may be represented by $X_d$ and $X_s$ respectively.⁵

(51)  $X_d = f_1(Y, Z_1, Z_2, \ldots)$

(52)  $X_s = f_2(Y, Z_1, Z_2, \ldots)$

⁵ If the absolute quantities are represented by U and the absolute
prices by V, X and Y are the percentage deviations of U and V
respectively. It is customary to define the
demand and supply functions in terms of the absolute values, but for the
present purpose it is more convenient to define them in relation to the
percentage deviations.




Assume that these functions are of such a nature that the
deviations in price can be separated linearly from the other factors
to a sufficient degree of approximation. This does not imply
lack of correlation between price and the others.

(53)  $X_d = \eta Y + D$  where  $D = f_3(Z_1, Z_2, \ldots)$

(54)  $X_s = e Y + S$  where  $S = f_4(Z_1, Z_2, \ldots)$

The demand function is here analyzed into two variable components,
a multiple of the price deviation ($\eta Y$) and the deviation
(D) in the quantity which would be purchased if there were no
price deviation (Y = 0). The supply function is similarly analyzed
into a different multiple of the price deviation ($eY$) and the deviation
(S) in the quantity which would be offered for sale in the
absence of a price deviation. Thus D and S measure the strength
of demand and supply apart from price and will be spoken of as
measures of demand and supply.
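As an illustration of equations (53) and (54), the price and quantity deviations at which the demand and supply functions take the same value can be found by equating the two lines. The sketch below assumes illustrative values of the two elasticities (`eta` for demand, `e` for supply); neither the numbers nor the function name come from the text.

```python
def market_clearing(D, S, eta, e):
    """Solve X_d = eta*Y + D and X_s = e*Y + S for X_d = X_s.

    Returns (X, Y): the percentage deviations in quantity and in price
    at the intersection of the demand and supply lines.
    """
    if e == eta:
        raise ValueError("demand and supply lines are parallel")
    Y = (D - S) / (e - eta)
    X = eta * Y + D
    return X, Y


# Assumed elasticities, for illustration only.
eta, e = -0.8, 0.2

for D, S in [(0.0, 0.0), (5.0, 0.0), (0.0, 5.0)]:
    X, Y = market_clearing(D, S, eta, e)
    print(f"D={D:+.1f} S={S:+.1f} -> quantity {X:+.2f}, price {Y:+.2f}")
```

A rise in D alone moves price and quantity in the same direction along the supply line, while a rise in S alone moves them in opposite directions along the demand line.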

For given values of D and S the equations define two straight
lines which describe the momentary demand and supply situations
respectively (Fig. 29). Their slopes relative to the Y-axis are
given by $\eta$ and $e$ respectively. These slopes are in accordance
with the customary definitions of the elasticities of demand and
supply, recalling that X and Y are percentage deviations.⁶

Fig. 29

⁶ The ratio $\frac{dU/U}{dV/V}$, where U and V are absolute quantities and
prices respectively. The ratio $X/Y$ is the elasticity of supply if S = 0,
and is the elasticity of demand if D = 0.

According to the usual theory, the actual quantity which
changes hands and the actual price are determined by the point
of intersection of the supply and demand curves. Under the approximations
previously assumed, and assuming constancy of the
elasticities, but variation of D and S, the percentage deviations
in quantity (Q) and in price (P) are linear functions of D and S.
Their values may be represented as determined by multiple regression
equations. It will be convenient to use single letters for
the path coefficients.


(55)  $P = p_1 \frac{\sigma_P}{\sigma_D} D + p_2 \frac{\sigma_P}{\sigma_S} S$

(56)  $Q = q_1 \frac{\sigma_Q}{\sigma_D} D + q_2 \frac{\sigma_Q}{\sigma_S} S$

The elasticity of supply may be obtained from the ratio of Q
to P under a fixed average supply situation (S = 0) but variable
demand.

(57)  $e = \frac{q_1 \sigma_Q}{p_1 \sigma_P}$

Similarly, elasticity of demand is given by the ratio of Q to
P when D equals zero.

(58)  $\eta = \frac{q_2 \sigma_Q}{p_2 \sigma_P}$

Since the standard deviations are obtainable directly from the
data it is merely necessary to find the values of the path coefficients
in order to calculate the two elasticities.

A diagram can be set up as in Fig. 30 indicating primarily that
P and Q are different linear functions of D and S. Three equations
can be written at once; two indicating complete determination
of P and Q by D and S, and one representing the correlation
between P and Q.

Fig. 30

(59)  $p_1^2 + p_2^2 + 2 p_1 p_2 r_{DS} = 1$

(60)  $q_1^2 + q_2^2 + 2 q_1 q_2 r_{DS} = 1$

(61)  $r_{PQ} = p_1 q_1 + p_2 q_2 + (p_1 q_2 + p_2 q_1) r_{DS}$

Unfortunately these three equations involve
5 unknowns. Other data must be brought to bear
on the problem before any solution is possible.
The diagram suggests two possible sources of
additional data. If any measurable quantity (A) can be found
which is correlated with the demand situation but which can
safely be assumed to be independent of the supply situation
($r_{AS} = 0$), we can write two new equations representing the
correlations $r_{AP}$ and $r_{AQ}$ respectively at the expense of only
one additional unknown ($a = r_{AD}$). We have now 5 equations
and 6 unknowns. If it can safely be assumed that there is no
correlation between the demand and supply situations ($r_{DS} = 0$), a
solution is possible. If such an assumption with regard to $r_{DS}$
does not seem justified, it may be possible to find a quantity (B)
correlated with the supply situation (as measured by S) but of
such a nature that no correlation with the demand situation need
be postulated. The correlations $r_{BP}$ and $r_{BQ}$ make possible two
more equations, with only one more unknown ($b = r_{BS}$), bringing
the number of equations and unknowns both up to 7. The path
coefficients and hence the elasticities are now determinate. The
additional equations are as follows:

(62)  $r_{AP} = a\,p_1$        (64)  $r_{BP} = b\,p_2$

(63)  $r_{AQ} = a\,q_1$        (65)  $r_{BQ} = b\,q_2$
^ • 


The hog and corn data referred to in the preceding section
were not obtained with the present purpose in mind, but may
furnish rough illustrations of the method. The total weight of
hogs marketed at the principal markets in the summer season
(March to October) 1889-1914, and the reported price may be
considered first. Absolute instead of percentage deviations from
trend were used but the correlations should not be affected much
and coefficients of variation may be used in place of the standard
deviations on a percentage scale. The most important single factor
affecting the summer hog pack was shown to be the corn crop
of the preceding year. It is assumed that it is a factor of type B,
correlated with the supply situation as measured by S but not
with the demand for pork as measured by D. It is further assumed
that there was no correlation between the supply and
demand situations.

Data 


Coefficient of variation — -Price 



6~p = i5.g& 

Quantity 



- lo.Bf 

Correlation— Price with quantity 




Correlation — Hog price with preceding 

corn 

crop 

- - •‘>'7 

Correlation-Weight of pack with preceding 

corn 


crop 




Equations 

Solution 



7 ^ ’ 

6 PC 

e =/-./33 

N- - / 

A ‘ 


-7^- ?/-/ 

. <^3 


f'NSZ 


= - .V7 




^5 = 

s - 




The solution indicates very little elasticity of supply
but a very considerable elasticity of demand.

Similar data were given for the winter weight of pack (1870-
1914). The largest correlation with a factor of preceding years
was with average summer live weight of hogs, one and one half
years before. This factor (an index of amount of breeding) is
again assumed to be related to the supply but not to the demand
situation, and again it is assumed that the supply and demand
situations vary independently of each other.

Data                Solution


* 

1^.59 

-A ’ 

f- . 


1 a. 75 

A. - 

- .755 


- .68 

> - 

■i- .los 

'^P0'. ' 

- . 63 


■h .91‘f 



5 ^ 

+ .^35 





The results are remarkably close to those of the quantity and 
price of summer pork. The low elasticities of supply are to be 
expected of an agricultural commodity the quantity of which is 
largely determined in advance and by factors independent of the 
market demand and which once produced must largely be 
marketed. 

I am indebted to my colleague, Professor Henry Schultz, for 
data on the quantity and price of potatoes marketed annually
from 1896 to 1914 and the suggestion that it would be interesting 
material for analysis by this method. Trends had been fitted by 
Professor Schultz and trend ratios of quantity and price obtained. 

Data 

Standard deviation of price ratios 
Standard deviation of quantity ratios 
Correlations 

Price — quantity (same year) 

Price — quantity (preceding year) 

Price — price (preceding year)

Quantity — quantity (preceding year) 

Quantity — price (preceding year)

It is assumed again that there is no correlation between supply
and demand situations ($r_{DS} = 0$) and
that the price (as a trend ratio) is a
factor of type B affecting the supply
of the following year but without
influence on the demand of the following
year. The solution is as follows:

s 

Figure 31 gives a graphical representation of the relation.

Again the virtual absence of elasticity of supply might perhaps
have been anticipated. The size of crop is largely determined



-e.45/ 



r 

J95 

(T 


t 30 

a. 







^,570 

^pf> 



JL > 


- .5Z% 

Q.H 








before the price is known and the crop must be disposed of regardless
of price. It is to be noted, however, that this result came
out quite independently of any such assumption.⁷ There are other
checks on the theory. Two of the correlations reported above
have not been used. According to the diagram of relations 

^ observed value^ - .52Z. ^ 

is in good agreement. Also 5 7 f . 

The agreement with the observed
value of +.570 is not as good as in the previous case, but
considering the small number of years, is not bad.

The absence of elasticity of supply in the case of potatoes
applies only within a single year. The fact that the supply is 
strongly correlated with the price of the preceding year 6 5 / 
indicates that in the long run there is considerable elasticity. The 
method of path coefficients readily lends itself to deduction of 
this long time elasticity.

Let A and 6 be the hypothetical averages of /i 

and B respectively over an indefinite Crv) period of years. The 
problem is to deduce the elasticities toward which the long time
supply and demand curves tend, from knowledge merely of the
correlations from year to year. The following equation can be
written from figure 32,
where ^ ' and 

^ are ' path .coefficients pertaining to the paths indicated. 



⁷ In two other cases studied by this method (P. G. Wright 1928) very
different results were obtained. In the case of butter, the elasticity of supply
came out 1.43, of demand −.62. In the case of flax seed, the elasticity of
supply came out even greater, 2.39, while that of demand was −.80. But
these are cases in which a high elasticity of supply is to be expected on a
priori grounds. It is interesting to note that in cases in which it seems
justifiable to assume a priori that there is no elasticity of supply (e = 0), it
follows (still assuming $r_{DS} = 0$) that $\eta = \sigma_Q / (r_{PQ}\,\sigma_P)$.





(66) 





-- vcCp,^j,^ 


/ - 7&J s, 

(67) 

A. - Ttc A- ~ 

P B 


) 


' >7C 7^ ('s, .72.^^ PS = 

''-A 5, 


( 68 ) yi__ r- 

& 


' ^9 (ir ^ - - 5 ^ JtilM^zim 

( 69 ) ji_, , = 




( 70 ) 

= 

X-p^ 

TL 

= >C 

<r^ 

-jfe. 


( 71 ) 

«» 

. £I 

?x 


<f^ 


•■■ <^a -' 

61 




Let and C^_ be the elasticities of long time demand and 
supply respectively. 

(ir’~i- I n \ f . » ^ \ . . 


( 72 ) 


( 73 ) 




^55- '^5 


'^•p B ^p 


1 * 

h ' j\ticpsa 


C 77C 


= 

7 % 

"fi- 


^ ^ .^- ~ TZ. ■ 




rvt 







Thus a reaction of price of one year on the supply situation 
of the next does not tend to produce any difference between long 
time and short time elasticity of demand. It does make a differ- 
ence, however, in long and short time elasticities of supply. In 
the case of potatoes substitution of values already found gives 
, Sz as the elasticity of the long time supply curve, insofar 
as determined by the reaction of the price of one year on the 
supply of the next. If it were legitimate to assume that there is 
no elasticity of supply within a year (e = 0), the formula for

reduces to s S =. ^ ^ . 

^ <r^ 

Tests of Significance 

In considering the reliability of path coefficients there are two 
questions which must be kept distinct. First is the adequacy of 
the qualitative scheme to which the path coefficients apply and 
second is the reliability of the coefficients, if one accepts the 
scheme as representing a valid point of view. The setting up of 
a qualitative scheme depends primarily on information outside of 
the numerical data and the judgment as to its validity must rest 
primarily on this outside information. One may determine from 
standard errors whether the observed correlations are compatible 
with the scheme and thus whether it is a possible one, but not 
whether it correctly represents the causal relation.

Having accepted a certain scheme with which the data are 
compatible, one would like to determine the reliability of the val- 
ues reached for the path coefficients. Obviously no single formula 
can be given, applicable to all cases. The basic formulae of the
method are ones for writing series of simultaneous equations,
which must be solved to obtain the unknown path coefficients and
correlation coefficients. These equations are in general non-linear
with respect to the unknown quantities, making it impossible to
express the solution in a general formula in which substitution
can be made in routine fashion.

Certain principles can, however, be illustrated by the results





in simple cases. No attempt will be made here to deal with the 
complications due to small numbers. It will be assumed that the 
errors of sampling are in general so small in comparison with the 
values of the coefficients that second degree terms in the errors 
may be ignored. It is recognized that a more thorough treatment 
of the matter is much to be desired. 
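The first-degree treatment of sampling errors can be sketched numerically. The example assumes the familiar large-sample standard error of a correlation coefficient, $(1 - r^2)/\sqrt{N}$, and the simplest scheme of one known variable plus a residual, in which the path coefficient equals $r$ and the residual path coefficient equals $\sqrt{1 - r^2}$. The function names and the values of `r` and `n` are hypothetical.

```python
from math import sqrt


def se_correlation(r, n):
    """Large-sample standard error of a correlation coefficient."""
    return (1 - r ** 2) / sqrt(n)


def se_first_order(f, r, n, h=1e-6):
    """Standard error of f(r), keeping only first-degree error terms.

    The sampling error of f(r) is taken as f'(r) times the sampling error
    of r; the derivative is estimated by a central difference.
    """
    slope = (f(r + h) - f(r - h)) / (2 * h)
    return abs(slope) * se_correlation(r, n)


def residual_path(r):
    """Residual path coefficient when r is the only known correlation."""
    return sqrt(1 - r ** 2)


r, n = 0.6, 400                      # hypothetical sample values
se_p = se_correlation(r, n)          # the path coefficient equals r here
se_u = se_first_order(residual_path, r, n)
closed_form = abs(r) * sqrt(1 - r ** 2) / sqrt(n)
print(se_p, se_u, closed_form)
```

The numerical derivative is used only to keep the sketch general; for the residual path coefficient the closed form $r\sqrt{1 - r^2}/\sqrt{N}$ gives the same result.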

The simplest set up (Fig. 33) is that in which one variable
is represented as a function of another, and of a residual factor.
The equations are as follows:


(74) 

(75) 

(76) 

From* (75) 






<c = c 


From (74) 


/V 




( 77 ) 

(assuming as notecj above that <5/3, 

and are small compared with and )• 



Fig. 33 


( 78 ) 

( 79 ) 








= - li. 








N 


The standard error of the residual path coefficient in a system
in which one variable is represented as determined
by a number of others (Fig. 34) may
be derived similarly

( 80 ) - 

Consider next the case in which variable
$V_0$ is a function of two uncorrelated variables
$V_1$ and $V_2$, and of a residual factor.

f 






Two different solutions are obtained for $\sigma_p$ depending on the
point of view. If it is accepted that $V_1$ and $V_2$ are wholly independent,
except for the accidents of sampling,

we have

(81) f?, = 

(82) (Tp = H = 6 


/V 



If, however, there are no grounds for treating
$V_1$ and $V_2$ as independent, except that $r_{12}$
was insignificantly small in the data at hand, the proper set up
is one in which a correlation between $V_1$ and $V_2$ is indicated
as in Fig. 35.

Fig. 35

(83) 

(84) 
giving 

( 85 ) 






-^,3. 


Treating sampling errors as differentials 

(86) rp - 

In the present case, $r_{12}$ (but not $\delta r_{12}$) is assumed to be
zero in the sample at hand. Thus


( 87 ) 


FF * Fa. 

01 at 


■^oz 


( 88 ) 


(Tp 


^ 6'„ tlJi, -m 




where the product moment of deviations of and 


*’x ^ iJL W az /a/ 

by the formula of Pearson and Filon. Again treating as
negligibly small. 


(90) 

f91) 




^ ^(Cf^ Cl } 




ox 




This is smaller than the value of obtained on the assump- 

tion of independence of V, and ^ is less than but 
larger for larger values of . 

If the correlation between the two known factors $V_1$ and $V_2$
of figure 35 is not negligible, the squaring of the full formula
for $\delta p$ and division by N leads after some reduction to the
formula




A somewhat rough estimate of the standard errors in the
analysis of birth weight of guinea pigs (p. 171) can be made
by this formula. The correlation between birth weight and size
of litter was, however, based on larger numbers
(3353) than the correlation involving gestation
period (1317). Adopting the smaller numbers
we find

- -.5i L . o-xo 

T = +.30 t .OZ.Z . Fig. 36 

'13G 

While these estimates of the standard errors do not take cogni- 
zance of the approximation involved in substitution of gestation 
period for observed interval between litter (estimated , ?5) 
they are sufficient to indicate that the calculated path coefficient 
can be relied upon as accurate to a first order, assuming the cor- 
rectness of the set up. 

The standard error of a path coefficient has not been worked 
out for systems in which one variable is represented as affected 
by more than two known variables. The standard error of the 


closely allied concrete regression coefficient is however well 
known and can be used in testing significance. 


Since 7 ^^ 


" , the variance of the path coefficient 


can be written 





f95) C ^ : , if ^ can be treated 

as constant. This probably gives fairly good approximation in 
any case and is so used by Brandt (1928). In the case of guinea 
pig weight discussed above, the correct formula gives a result a 
little smaller than this approximation. 

It will be noted that the standard errors may take very high 
values if the independent variable under consideration ($V_1$)
approaches complete determination by the others in the system, i.e.
if approaches 0. In general, coefficients for paths 

leading from variables closely correlated with each other are sub- 
ject to large standard errors. In making up a system, whether 
for prediction purposes or interpretation the aim should be to 
select factors closely correlated with the dependent variable but 
as nearly independent of each other as practicable. 
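The remark that closely correlated factors carry large standard errors can be illustrated with the familiar two-factor case. The sketch below is not the formula of this paper; it assumes the standard textbook result that, with two standardized independent variables correlated to the degree r12, the sampling variance of either regression coefficient is multiplied by 1/(1 - r12^2). The function name `variance_inflation` is hypothetical.

```python
def variance_inflation(r12: float) -> float:
    """Factor multiplying the sampling variance of a standardized
    regression coefficient when the two independent variables are
    correlated to the degree r12 (textbook result, assumed here)."""
    if not -1.0 < r12 < 1.0:
        raise ValueError("r12 must lie strictly between -1 and 1")
    return 1.0 / (1.0 - r12 ** 2)


# The factor grows without bound as one factor approaches complete
# determination by the other.
for r in (0.0, 0.5, 0.9, 0.99):
    print(f"r12 = {r:4.2f}   variance multiplied by {variance_inflation(r):7.2f}")
```

With independent factors the multiplier is 1; at r12 = .99 it exceeds 50, which is the practical reason for choosing factors as nearly independent of each other as practicable.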

If the dependent variable is completely determined by the 
specified factors ‘ standard error of the con- 

crete partial regression coefficient becomes zero. This is not the 
case with that of the path coefficient. Thus in the two factor 
case discussed above 


(r\ 

/V ■ 




More generally, if can be treated as constant (as it can 


= ^7 


= Pa, 


which is in agreement with the preceding result. 

Another simple set up which is of interest is that in which 
three variables are arranged in chain sequence (Fig. 37). 

Here again the point of view makes a difference. If the above 
relation is merely an empirical one, the situation is merely 
a special case of that just discussed (the case in which p ^ 


■K. 0 


otncC 

: = ) 

. By substitution 

(98) 

1, 

■ irf 



N L 


(99) 

»z. 



(100) 

x. 

(Tp 

’ox. 




If, however, is represented as the sole intermediary between 
Vo and Vl on theoretical grounds, the result is different. Two 
different determinations can be made of CL and of 'Tp the 
reason being that more equations can be written than there are 
unknown path coefficients. From, ^ 


(101) 

2. 

ol 



. From P = 
a 

A a. 

(102) 

i 

rv 

6-^:x/-o 

< 

. From F = 

/X 

> 

(103) 

1- 

’ {2- 

L 

“ M 


. From = 

■^4#/ 

(104) 


/V 

< 

X 



Similarly two determinations can be made of From 

(105) 

(106) CTp ^ ^ [o-X.f- {/-X)0-XJ]. 

01X 

With standard deviations calculated from two independent 
sets of data in each case, a combination estimate, smaller than 
either, can be obtained from the formula 




This illustrates the important principle that where there is a su- 
perfluity of equations for determining the path coefficients, the 
standard errors of these are correspondingly reduced. In the 
analysis of corn and hog correlations 42 path coefficients were 
found with which 510 correlations (and 4 cases of complete de- 
termination) were in agreement to the extent expected from their 
standard errors. Calculation of the standard errors of the path 
coefficients in this system seems out of the question, but it may 
safely be assumed that values of the order ^ , which might be 
based on 42 equations are to be reduced by considerable amounts 
by the superfluity of data available. 
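The combining formula itself is not legible in this copy. As an illustration only, the sketch below assumes the usual rule for two independent determinations of the same quantity, namely weighting each inversely as its squared standard error, which always yields a combined standard error smaller than either. The function name and the numerical values are hypothetical.

```python
import math


def combine_independent(est1, se1, est2, se2):
    """Combine two independent estimates of one quantity by
    inverse-variance weighting (an assumed, standard rule; not
    necessarily the formula printed in the source)."""
    w1 = 1.0 / se1 ** 2
    w2 = 1.0 / se2 ** 2
    estimate = (w1 * est1 + w2 * est2) / (w1 + w2)
    standard_error = math.sqrt(1.0 / (w1 + w2))
    return estimate, standard_error


# Hypothetical determinations of the same path coefficient.
estimate, standard_error = combine_independent(0.50, 0.04, 0.46, 0.03)
print(f"combined estimate {estimate:.3f} with standard error {standard_error:.3f}")
```

Under this rule the reciprocal of the combined squared standard error is the sum of the reciprocals of the separate ones, so each additional independent equation can only reduce the standard error.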

There are some interesting contrasts in the standard errors 
given above. If is large, may be large in the empirical 

system. But if the theory that X, is the only intermediary rests 
on adequate grounds, independent of the observed correlations, 
may be small with large . 

We will conclude with consideration of a set up like that used 
for the relation of supply and demand to price and quantity 
(Fig. 38). It will be assumed first that the number of cases is 
large (a condition contrary to that found in the examples given). 
Differentiation of the 5 basic equations gives 5 equations 
expressing the relations between small deviations of the 
path coefficients and correlations. 

( 108 ) ( 113 ) ^ ^ x: cJ 

(109) / (114) z ^ i 

( 110 ) ( 115 ) 

(HI) (116) ^ ^ 5 ^ 

(112) (117) * V S 






Thus 



(118) 



(119) 

O' ^ ^2" 


(120) 



(121) 

Az - ^ ■ U - ^ ' 

S Bp’ 


Solution of (120) and (121) as simultaneous equations gives 
expressions for and in terms of 

from which their squared standard errors can be found by taking 
the average squares. Letting A, B and C be the coefficients, 


( 122 ) 

( 123 ) 


^ /? <5^ f- B f- C. 

0^ rtSl 0a(^ B7=^ 


/! cc 


■:iAB 


BA 


-f- ^ AC 777 

P<SL I3P 


f- - 2 . 6C 777 


BP 


The product moments of the deviations of the correlation co- 
efficients can be found by Pearson & Filon's formula cited on 
page 206. 

The standard errors of and can be found at once 
with the help of equations (11) and (12) while that of can 
be found from (9) or (10) after expressing (or in 

terms of deviations of the known correlation coefficients. 

The significance of the coefficients of elasticity is most easily 
investigated by taking these on scales in which the standard errors 
of the percentage deviation in price and quantity are taken as 
unity i.e. by finding the standard error of ^ and of ^ instead 

of e ~ ~ and 7? ~ respectively. These standard 

errors can be found from the formula for the standard error of 
a ratio. 

( 124 ) 






7^" 
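Equation (124) cannot be read in this copy. For orientation, the sketch below assumes the ordinary first-order (large-sample) approximation for the standard error of a ratio x/y, allowing for a correlation r_xy between the errors of numerator and denominator. It is offered as the common textbook form, not as a transcription of (124); the function name and the sample values are hypothetical.

```python
import math


def ratio_standard_error(x, se_x, y, se_y, r_xy=0.0):
    """First-order standard error of x / y.

    Assumed approximation:
        var(x/y) ~ [se_x^2 - 2 (x/y) r_xy se_x se_y + (x/y)^2 se_y^2] / y^2
    """
    if y == 0:
        raise ValueError("the denominator y must not be zero")
    ratio = x / y
    variance = (se_x ** 2
                - 2.0 * ratio * r_xy * se_x * se_y
                + ratio ** 2 * se_y ** 2) / y ** 2
    # Guard against a tiny negative value produced by rounding.
    return math.sqrt(max(variance, 0.0))


# Hypothetical values: numerator 2.0 +/- 0.2, denominator 4.0 +/- 0.4.
print(round(ratio_standard_error(2.0, 0.2, 4.0, 0.4), 4))
```

The approximation is serviceable only when the standard error of the denominator is small relative to the denominator itself, which is the large-sample condition assumed in the text.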



The product moments of the path coefficients can be obtained 





by squaring equation (8) after expressing and J'p in terms 
of and equation (13) or the converse. 

The numbers of cases in the actual examples were not large 
enough to make the method a satisfactory one. The calculations 
have been carried through, however, with the results given below. 



          Summer Pork          Winter Pork          Potatoes 
           26 Years             44 Years            19 Years 


-.63 t JZ- 

~ . & s t 

.OS 

- , SS 

t ,a<£, 


-,</7 t ,/<b 

63 t 


— - 5"6 

t . >4 

-^610 

-.ip4 t .IZ- 

i-. g-3 jt 

.05 

f- . 6 5" 

t . t4 

'A 

± .XO 

H-. 64 t 


■t .S'O 

t .2& 


-.73 ± . /f 

- 74 t 

,0^ 

SC 

t ./5 


f-./3 - .2.4 

/ i 


-t, OX 

t .27 


f ,‘^9 t .a3 

-p, 9? i 

.6! 


t. , o€i 

5 

t JX 

-p. s4 t 

.as 

f' ,6 

t .i4 


f-.n ir .39 

+-^ lU t 

JT 

f-.as 

t .57 

Nfi. 

t .3g 

- f. 3 X ^ 

J7 

-f. !(p 

t ,zf 

e 

f- .13 

+ .tl 


f, 0 3 


n 

- .94 






The most nearly satisfactory case is that of winter pork, based 
on rather large primary correlations obtained from 44 years' 
experience, but even here the standard error of is nearly as 





large as itself. In the other cases, the standard error of is 
larger than . The term 3 ^^/' omitted in equation (7) is of the 
order of the term 3 ^^ or larger, making this equation invalid. 
The other equations are not affected, at least to anything like as 
great an extent. An approximate solution can be obtained even 
though equation (7) is omitted, from the consideration that in 
this case in which is small, must be very small and may 
be ignored. Thus 


( 125 ) 

( 126 ) 

( 127 ) 

( 128 ) 


/36i 


The results are substantially the same as those obtained above, 
since the values assigned <5^^ were. very small, even if not reliable. 
It may be safely concluded that winter pork at the large markets 
has very little elasticity of supply but a moderate elasticity of 
demand. The results for summer pork and for potatoes are in 
harmony with similar interpretations but are based on such inade- 
quate numbers as to have little significance in themselves. 


REFERENCES 

Brandt, A. E., 1928. Calculation and use of the standard deviation of 
partial regression coefficients. Iowa St. College Jour. Sci. 2: 235-242. 

Burks, B. S., 1928. The relative influence of nature and nurture upon 
mental development; a comparative study of foster parent-foster child 
resemblance and true parent-true child resemblance. 27th Yearbook 
of Nat. Soc. for Study of Education, 1928, Part 1: 219-316. 

Calder, A., 1927. The role of inbreeding in the development of the 
Clydesdale breed of horse. Proc. Roy. Soc. Edinb. 47: 118-140. 

Fisher, R. A., 1925. Statistical methods for Research Workers. 239 pp. 
Oliver & Boyd, Edinburgh. 

Jennings, H. S., 1916. The numerical results of diverse systems of 
breeding. Genetics 1: 53-89. 

Kelley, T. L., 1927. Statistical Method. 390 pp. The Macmillan Co., 
New York. 

Krichewsky, S., 1927. Interpretation of Correlation Coefficients. Ministry 
of Public Works, Egypt. Physical Dept. Paper No. 22, Cairo. 





Lush, J. L., 1930. The number of daughters necessary to prove a sire. 
Jour. Dairy Sci. 13: 209-220. 

------ 1932. The amount and kind of inbreeding which has occurred in the 
development of breeds of livestock. Proc. 6th Internat. Congress of 
Genetics 2: 123-124. 

McPhee, H. C., and S. Wright, 1925. Mendelian analysis of the pure breeds 
of live stock. III. The Shorthorns. Jour. Hered. 16: 205-215. 

------ 1926. Mendelian analysis of the pure breeds of live stock. IV. The 
British Dairy Shorthorns. Jour. Hered. 17: 397-401. 

Minot, C. S., 1891. Senescence and rejuvenation. Jour. Physiol. 12: 97-153. 

Niles, H. E., 1922. Correlation, Causation and Wright's theory of path 
coefficients. Genetics 7: 258-273. 

------ 1923. The method of path coefficients, an answer to Wright. 
Genetics 8: 256-260. 

Smith, A. D. B., 1926. Inbreeding in cattle and horses. Eugen. Rev. 14: 
189-204. 

Wright, P. G., 1928. The tariff on animal and vegetable oils. 347 pp. The 
Macmillan Co., New York. 

Wright, S., 1918. On the nature of size factors. Genetics 3: 367-374. 

------ 1920. The relative importance of heredity and environment in 
determining the piebald pattern of guinea pigs. Proc. Nat. Acad. Sci. 6: 
320-332. 

------ 1921a. Correlation and Causation. Jour. Ag. Res. 20: 557-585. 

------ 1921b. Systems of mating. Genetics 6: 111-178. 

------ 1922. Coefficients of inbreeding and relationship. Am. Nat. 56: 
330-338. 

------ 1923a. The theory of path coefficients, a reply to Niles' criticism. 
Genetics 8: 239-255. 

------ 1923b. Mendelian analysis of the pure breeds of livestock. I. The 
measurement of inbreeding and relationship. Jour. Hered. 14: 339-348. 
II. The Duchess family of Shorthorns as bred by Thomas Bates. Jour. 
Hered. 14: 405-422. 

------ 1925a. Corn and hog correlations. Bull. No. 1300, 60 pp., U. S. 
Dept. of Agric. 

------ 1926a. A frequency curve adapted to variation in percentage 
occurrence. Jour. Amer. Stat. Assoc. 21: 162-178. 

------ 1926b. Effects of age of parents on characteristics of the guinea 
pig. Amer. Nat. 60: 552-559. 

------ 1927. The effects in combination of the major color-factors of the 
guinea pig. Genetics 12: 530-569. 

------ 1931a. Statistical methods in biology. Jour. Amer. Stat. Ass. 
Supplement. Papers and Proceedings of the 92nd annual meeting. 26: 
155-163. 

------ 1931b. Evolution in mendelian populations. Genetics 16: 97-159. 

------ 1932a. On the evaluation of dairy sires. Proc. Amer. Soc. Animal 
Prod. 1932: 71-78. 





------ 1932b. General, group and special size factors. Genetics 17: 
603-619. 

Wright, S., and H. C. McPhee, 1925. An approximate method of calculating 
coefficients of inbreeding and relationship from livestock pedigrees. 
Jour. Ag. Res. 31: 377-383. 

Wright, Sewall, 1933a. Inbreeding and homozygosis. Proc. Nat. Acad. Sci. 
19: 411-420. 

------ 1933b. Inbreeding and recombination. Proc. Nat. Acad. Sci. 19: 
420-433. 







MATHEMATICAL FOUNDATION FOR A METHOD 
OF STATISTICAL ANALYSIS OF HOUSEHOLD 

BUDGETS 

By John W. Boldyreff 
Harvard University 

The object of this paper is to offer a satisfactory method of 
statistical analysis of household budgets in accordance with the 
general principles of mathematical logic. I have, therefore, taken 
these words of Fourier: “Mathematics has no symbols for 
confused ideas”¹ as my guiding light, and set out to effect a simple 
and comprehensive analysis of the general type of statistical data 
which is included under the heading “household budgets,” i.e. 
monetary incomes and expenditures of these incomes. 

I have tried to lay the greatest stress, accordingly, on the 
clarity and terseness of the exposition rather than inclusiveness, 
attempting to diminish to the utmost the number of undefined 
ideas and the undemonstrated propositions. I make no special claim 
to originality and base my method upon the works of numerous 
previous investigators, summarizing analytically old principles and 
ideas on the bases of mutual consistency and reducibility to more 
fundamental principles. This paper is specially framed to relieve 
the feeling of intellectual discomfort which of late has been 
troublesome to conscientious investigators in our field, so overcrowded 
with revelations of numerous parts, rather than with indications 
of the mode of combination of the major components within the 
whole. I address here the properly instructed mind and so 
dispense at times with the elaboration of some statements. 

In this summary, therefore, we shall be concerned with laying 
down a rigid method for analysing the budgetary data, defining 
their scope formally to include only the monetary incomes and 
¹ Quoted from J. A. Schumpeter, Die Wirtschaftstheorie der Gegenwart, 
Wien, I, 11, 19^. 





the relative amounts of these incomes spent in a defined manner. 
This, naturally, excludes all reference to economic theory (e.g. 
utility, demand curves, etc.) from our discussion; I do so not 
because of a desire to depreciate the importance of that kind of 
belief, but because I do not wish to consider it here. 

Obviously, “it is never a mathematical proposition which we 
need, but we use mathematical propositions only in order to infer 
from propositions which do not belong to mathematics to others 
which equally do not belong to mathematics.”² Moreover, it is 
also true that nothing can be purely logical or mathematical 
(unless we follow Hilbert and define mathematics as a game with 
meaningless marks on paper); all propositions involve some 
psychological terms such as defining, meaning, asserting or naming. 
The method and scope of a mathematical analysis is in a like 
manner dependent on the purpose for which it is to be undertaken. 

The purposes in the study of budgetary data assume varying 
emphasis depending on the point of view of approach, that of 
economics, home economics, social welfare, and sociology.³ All of 
these approaches are concerned with the relation between the sizes 
of incomes and the relative amounts spent for certain goods and 
services. 

² Wittgenstein, Tractatus Logico-Philosophicus, 6.211. 
³ C. C. Zimmerman, Am. J. Soc., vol. XXXIII, 6, 1928. 

Generally, the classification of expenditures of an income is 
made as to the amounts (or proportions) spent for food, clothing, 
rent, light, education, health, recreation, savings, and amusement. 
Some investigators limit their classifications to five items: food, 
clothing, rent, fuel and light, and sundries (everything not 
included under the first four). Others prefer to subdivide the 
classification further and break up each of the above nine types 
of expenditure into what they deem to be its component parts, 
and proceed to study these new relationships and to generalize 
from them. On my part, I judge the latter performances 
extremely dangerous. It seems to me that the analysis of the major 
components of an income's expenditure in their relationships to the 
size of income and to each other should be developed and 
perfected beforehand, and then only gradually extended to apply 
to the minor items. Moreover, the splitting up of a few variables 
(the types of expenditures) into many introduces other difficulties: 
aside from the fact that a study of simple relationships is apt 
to be more clarifying, the introduction of a component part of 
the whole variable as a new variable immediately raises the 
question why this component is isolated and not the other. None of 
the arguments that can be generally cited (and usually no 
arguments are cited) are really decisive, and the position is extremely 
unsatisfactory to anyone with real curiosity about the fundamental 
relationships. Unless we wish the analysis of the budgetary data 
to remain self-contradictory and meaningless, we must adopt a 
limiting method, and study not more than two variables at a time. 
Then, and only then, can we hope to establish or discover any 
“laws,” or functional relationships. 

In my experiments to develop a satisfactory method of 
analysis I would begin generally with five classes of expenditures: 
food, clothing, rent, fuel and light, and sundries. Later, I 
came to the conclusion that some of these tend to have a sort of 
complementary relationship between them. Thus, “fuel and light” 
are often higher or lower with a higher or lower “rent,” and in 
some cases a part of “rent” covers “fuel and light,” in other cases 
the discomfort and monetary cost of “fuel and light” lowers the 
“rent” expenditure. Likewise, some complementary relationship 
is observed between “fuel and light” and “clothing” (especially 
in submarginal households) and between “clothing” and “rent” 
(e.g. social demand of the stylish residential district). These are 
merely a few examples which led me to question the validity of 
initial isolation of these three items (clothing, fuel and light, 
and rent) from each other. Accordingly, I suggest limiting our 





investigation to the study of possible relationships between: (1) 
the size of the income and (a) the amount spent for food, (b) 
the amount spent for sundries; (2) the amount spent for food 
and the amount spent for sundries, assuming temporarily for 
convenience of analysis all other itemized expenditures under 
rent, clothing, and fuel and light, to be not subject to individual 
isolation. 

As to the unit in household budget, the variety of units 
employed bewilders at first a mathematical student. Of these, 
the old scale of two children for one adult, the various other 
“adult equivalents” (e.g. Engel's quet scale of 3.0 for woman 
of 20 and 3.5 for man of 25 years; Atwater's scale of 10, 8, 7, 5, 
2.5; then the scales of Voit, U.S.D. of L., H. C. Sherman and 
L. H. Gillett, G. Lusk, L. Emmett Holt, and others, each scale 
giving “adult equivalents” for children, male and female), all 
clearly show inability of investigators to agree on a scale to 
determine the size of a family in standard units. It seems to me 
that the inventors of such scales forget somehow that “taking 
an arbitrary individual in the living nature - man, an animal, 
a plant - it will generally be found impossible to find out another 
individual in all respects identical to the first one chosen.”⁴ The 
standard scale in budgetary studies is less valid than usual 
statistical abstractions, for such factors as geographic space (climate, 
nutritive ratio, energy value, cost), social space (stratification and 
differentiation), economic space (size of incomes), occupational 
space (caloric requirement, etc.), time factor (daily, weekly, 
monthly, seasonal, and longer fluctuations), as well as age (for 
there is a great latitude in “adult” ages and a corresponding 
variability in “requirements”) and sex differences, are admittedly 
affecting each budgetary individual in a variety of unknown ways. 
In view of the complexity of the problem and the enormousness 
of human population, any “adult equivalent” scale will appear 

⁴ C. V. L. Charlier, Acta Universitatis Lundensis, 1905-6, XVI, 5, p. 3. 





to be based on samples obtained in gross violation of the sampling 
theory, for it is very doubtful that a sufficiently large and 
representative sample can be secured and it is very hard to see how 
it can escape being greatly biased. Besides, most of these scales 
are based on energy requirement only, and refer to “food” but 
not at all to other types of expenditure; therefore, they would be 
of little general significance even if they were valid in their 
specific aspect. Personally, I must reject all such scales as 
meaningless and incline to hesitate between adopting a “normal family” 
(on basis of a standard number of members, irrespective of their 
characteristics) and a “household” (irrespective of number of 
members and of their characteristics), the presumption being that 
in a sufficient random sample the differences either way will tend 
to cancel out. This may not seem to be a more accurate method 
than others, but, in all probability, it is just as accurate, and its 
virtue lies, moreover, in the fact that its limitations are all on the 
surface instead of being hidden away behind a misleading label. 
The data of the last Census seem to favor this attitude.⁵ 

The purpose of budgetary analysis is to discover, allegedly, 
certain functional relationships, if any, between the varying 
income and the relative amounts of each type of expenditure. To 
discover such relationships and to determine them explicitly one 
must recognize that all laws logically function within limits. One 
need not go as far as Hilbert and insist that anything involving 
an infinity of any kind must be meaningless (in pure mathematics 
this may be a useful abstraction), but it should be obvious that 
in all organic laws anything infinite appears a stupid fiction which 
cannot be argued for except by proceeding to a limit. The 
behavior of the budgetary items is clearly a biotic phenomenon, 
which fact some of the investigators in our field tend to overlook 
consistently. If there are any functional relationships in the 
budgetary data these will be found only within definite limits of 
minimum and maximum, and any contradicting evidence to such 
laws if found below or above these limits cannot be interpreted 
as disproving such laws. 

⁵ L. E. Truesdell, New Family Statistics for 1930, J. Am. Statist. Assn., 
March 1933 (Supplement), pp. 154-8. 

We shall make our points clearer by illustrating the above 
exposition by the so-called Engel’s Law (I am referring to the 
second part of it), incidentally commenting briefly on its validity 
and demonstrating the details of our method. 

It will not be amiss to formulate in a few words the part 
of Engel's Law (1895) we shall be concerned with in our dis- 
cussion. Comparing the incomes of laboring families, middle class 
families, and well-to-do families, Engel conjectured that: 

(1) the greater the income, the smaller the percentage 
of outlay for subsistence (food), 

(2) percentage of outlay for clothing is approximately 
the same, whatever the income, 

(3) percentage of outlay for rent, and for fuel and 
light, is approximately the same, whatever the income, 

(4) as income increases in amount, the percentage of 
outlay for sundries becomes greater. 

Most of the investigators incline to accept the first and the 
last of Engel's propositions, both from the static and dynamic 
viewpoints. As for myself, I like to consider this law with refer- 
ence to the following questions : 

(1) as incomes increase does the percentage of outlay 
for food decline and the percentage of outlay for sundries 
increase? 

(2) is this a static law; i.e. in a given place, at a 
given time, will there be a higher percentage of outlay 
for sundries and lower percentage of outlay for food 
with larger incomes, and vice versa for smaller incomes? 

(3) does this hold in the dynamic aspect: as incomes 
increase (in time) do the percentages of outlay for food 





decline and those for sundries rise, for short and long 
time? 

(4) is this law reversible, i. e. if incomes decrease do 
the percentages of outlay for food rise and those for 
sundries decline, statically and dynamically? 

(5) can the percentages of outlay for clothing, rent, 
and fuel and light be treated as constant, statically and 
dynamically ? 

(6) can this law be interpreted to mean that when the 
percentage of outlay for food declines the percentage of 
outlay for sundries rises, and vice versa, statically and 
dynamically? 

(7) if this law is valid, what is its significance for 
forecasting? 

Let us consider first the problem of limits from a purely 
abstract viewpoint. We assume for the sake of argument this 
law to be valid and set up a hypothetical series of incomes with 
the respective percentages and amounts of outlays for food and 
for sundries. The following example shows clearly that a limit 
is eventually reached when the law becomes automatically in- 
operative. 


TABLE I. 

Income in $     % for Food   $ for Food   % for Sundries   $ for Sundries 
Under    900         -            -              -                - 
Under  1,000        50          500             10              100 
Under  2,000        45          900             15              300 
Under  3,000        40        1,200             20              600 
Under  4,000        35        1,400             25            1,000 
Under  5,000        30        1,500             30            1,500 
Under  6,000        25        1,500             35            2,100 
Under  7,000        20        1,400             40            2,800 
Under  8,000        15        1,200             45            3,600 
Under  9,000        10          900             50            4,500 
Under 10,000         5          500             55            5,500 













Aside from demonstrating the inevitableness of limits, this 
illustration shows also that from purely common sense 
considerations constancy of interrelationship between variation of 
percentages for food and percentages for sundries is not feasible. That 
the absolute amount spent for food cannot decline with increase 
of income but should constantly keep on rising (though, perhaps, 
in small amounts), should be clear from common sense, even if we 
shall consider this amount as stationary after a certain sum is 
reached and credit the increase to sundries (cooks, maids, travel, 
eating out, etc.); yet, even in such cases decline should be out of 
the question. 
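The arithmetic behind Table I can be retraced in a few lines. The sketch below assumes only the hypothetical schedule of the table (food falling from 50 per cent by steps of 5, sundries rising from 10 per cent by steps of 5, incomes taken at the class limits $1,000 to $10,000); the variable names are illustrative.

```python
# Hypothetical schedule of Table I, incomes taken at the class limits.
incomes = list(range(1000, 11000, 1000))   # $1,000 ... $10,000
food_pct = list(range(50, 0, -5))          # 50, 45, ..., 5
sundries_pct = list(range(10, 60, 5))      # 10, 15, ..., 55

# Dollar outlays implied by the percentages.
food_dollars = [inc * p // 100 for inc, p in zip(incomes, food_pct)]
sundries_dollars = [inc * p // 100 for inc, p in zip(incomes, sundries_pct)]

# Largest food outlay, and the first income at which the outlay falls.
peak = max(food_dollars)
first_decline = next(
    inc
    for inc, prev, cur in zip(incomes[1:], food_dollars, food_dollars[1:])
    if cur < prev
)

for row in zip(incomes, food_pct, food_dollars, sundries_pct, sundries_dollars):
    print("{:>6,} {:>4} {:>6,} {:>4} {:>6,}".format(*row))
print("peak food outlay:", peak, "| first decline at income:", first_decline)
```

A percentage falling by a constant step thus forces the absolute food outlay to turn downward once income passes a certain point, which is the limit beyond which the law becomes inoperative.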

Now we can give an illustration of the validity of assumption 
that the percentages of outlay for rent, clothing, and fuel and 
light, for convenience of analysis and until proven to be contrary, 
can be held constant. We have tried this with a variety of data 
and generally found this to be true. 


TABLE II. 

Comparison of the Percentages of the Total Family Expenditure 
for the Different Groups of Living Costs⁶ 

                 Eden's 73   Engel's    Le Play's     U.S.D.L.   U.S.D.L.   Groton, 
Item             English     Belgian    Method        (2562)     (12096)    N.Y. 
                 Budgets     Data       100 Budgets                         (92) 
                 1796        1853       1829-88       1890-1     1918-9     1919 

Food               73         66.9        56.8         41.1       38.2       41.7 
Rent               12          7.6         6.8         15.1       13.4       13.1 
Clothing            7         14.9        16.5         15.3       16.6       11.3 
Fuel and light      5          5.6         4.3          5.9        5.1        6.8 
Sundries            3          5.0        15.6         27.7       26.4       27.1 

Adding the “rent” and “clothing” items from Table II we 
obtain: 19.0, 22.5, 23.3, 30.4, 30.0, and 24.4; by adding to these 
their respective “fuel and light” items we obtain: 24.0, 28.1, 


⁶ Taken from Noble, Cornell University Agricultural Experiment 
Station Bulletin, # 431, Sept., 1924. 












27.6, 36.3, 35.1, and 31.2. It seems justifiable to assume these 
items in their summation to be a constant factor in time analysis. 
That they are constant for static analysis will be shown later. 
But it may be mentioned in passing that taking the data from 
Noble's Table 19 (average percentages of expenditure of items 
of cost of living of 515 families in New York City, by income 
groups)⁷ and adding up our “constant factor” we get 35.9 for 
the lowest income group and 36.4 for the highest. 
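The sums quoted from Table II can be checked directly. The sketch below takes the rent, clothing, and fuel-and-light percentages as read from the table; the dictionary keys are shortened labels chosen here for convenience.

```python
# (rent, clothing, fuel and light) percentages as read from Table II.
table_ii = {
    "Eden 1796":        (12.0,  7.0, 5.0),
    "Engel 1853":       ( 7.6, 14.9, 5.6),
    "Le Play 1829-88":  ( 6.8, 16.5, 4.3),
    "U.S.D.L. 1890-1":  (15.1, 15.3, 5.9),
    "U.S.D.L. 1918-9":  (13.4, 16.6, 5.1),
    "Groton 1919":      (13.1, 11.3, 6.8),
}

# Rent plus clothing, then the full "constant factor" with fuel and light.
rent_clothing = {k: round(r + c, 1) for k, (r, c, f) in table_ii.items()}
constant_factor = {k: round(r + c + f, 1) for k, (r, c, f) in table_ii.items()}

for name in table_ii:
    print(f"{name:<18} {rent_clothing[name]:5.1f} {constant_factor[name]:5.1f}")
```

The six totals run from 24.0 to 36.3, so the constancy claimed is a rough one, closest among the three American studies.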

One illustration more. Below are the figures taken from 
the U.S.B.L., 18th Annual Report, 1904, p. 101. 


TABLE III. 

Classified Income    Rent    Fuel   Light    Food   Clothing  Sundries 
Under   $200        16.93    6.69    1.27   50.85     8.68     15.58 
Under    300        18.02    6.09    1.13   47.33     8.66     18.77 
Under    400        18.61    5.97    1.14   48.09    10.02     16.09 
Under    500        18.57    5.54    1.12   46.88    11.39     16.50 
Under    600        18.43    5.09    1.12   46.16    11.98     17.20 
Under    700        18.48    4.65    1.12   43.48    12.88     19.39 
Under    800        18.17    4.14    1.12   41.44    13.50     21.63 
Under    900        17.07    3.87    1.10   41.37    13.57     23.02 
Under  1,000        17.58    3.85    1.11   39.90    14.35     23.21 
Under  1,100        17.53    3.77    1.16   38.79    15.06     23.69 
Under  1,200        16.59    3.63    1.08   37.68    14.89     26.13 
1,200 and over      17.40    3.85    1.18   36.45    15.72     25.40 


The “constant factor” taken at the lowest and highest incomes 
is found to be 33.57 and 36.15 respectively. The examination of 
the table from the point of view of finding a law, or functional 
relationship, reveals such phenomenon for the range of incomes 
from $500 to $1,200, inclusive. We shall proceed to examine the 
data included in these limits in accordance with our method. 

We find the “constant factor” for $500 income to be 36.62 
and for $1,200 income, 36.19. 
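The “constant factor” within the chosen limits can be recomputed from Table III as rent plus fuel plus light plus clothing. The sketch below uses the figures as read from the table for the classes $500 to $1,200; the layout of the data is a convenience adopted here.

```python
# Income class limit -> (rent, fuel, light, clothing), from Table III.
table_iii = {
    500:  (18.57, 5.54, 1.12, 11.39),
    600:  (18.43, 5.09, 1.12, 11.98),
    700:  (18.48, 4.65, 1.12, 12.88),
    800:  (18.17, 4.14, 1.12, 13.50),
    900:  (17.07, 3.87, 1.10, 13.57),
    1000: (17.58, 3.85, 1.11, 14.35),
    1100: (17.53, 3.77, 1.16, 15.06),
    1200: (16.59, 3.63, 1.08, 14.89),
}

# Sum of the four items treated together as one constant factor.
constant_factor = {inc: round(sum(items), 2) for inc, items in table_iii.items()}

for inc, total in constant_factor.items():
    print(f"${inc:>5,}  {total:5.2f}")
print("spread:", round(max(constant_factor.values()) - min(constant_factor.values()), 2))
```

Over the eight classes the total stays within about two points of expenditure, while food and sundries each move by some nine points over the same range.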


⁷ Op. cit. 










We assumed a straight line relationship and computed simple 
coefficients of correlation between: 

(1) incomes and percentages of outlay for food 

(2) incomes and percentages of outlay for sundries 

(3) percentages of outlay for food and those for sun- 
dries 

We want to stress in this connection that to us the coefficient 
of correlation means a measure of relationship which is already 
empirically established, not a proof of such relationship. We 
used L. P. Ayres'⁸ formula which we found convenient for 
computing purposes. To avoid a fictitious correlation between 
incomes and the percentages of outlay for food and for sundries, 
we have divided the income column by a constant. To facilitate 
computation we have likewise divided the “percentages for food” 
and the “percentages for sundries” columns by constants. 
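The three coefficients can be recomputed from the $500 to $1,200 classes of Table III. The sketch below uses the ordinary product-moment formula rather than Ayres' computing formula, which is not reproduced in the text; incomes are taken at the class limits, and the divisor 100 is an arbitrary choice made here. It also shows that dividing a column by a constant leaves the coefficient unchanged.

```python
import math


def pearson_r(xs, ys):
    """Product-moment coefficient of correlation between two columns."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Classes $500 to $1,200 of Table III (class limits, % food, % sundries).
income = [500, 600, 700, 800, 900, 1000, 1100, 1200]
food = [46.88, 46.16, 43.48, 41.44, 41.37, 39.90, 38.79, 37.68]
sundries = [16.50, 17.20, 19.39, 21.63, 23.02, 23.21, 23.69, 26.13]

r_income_food = pearson_r(income, food)
r_income_sundries = pearson_r(income, sundries)
r_food_sundries = pearson_r(food, sundries)

# Dividing a column by a constant rescales it without altering r.
r_scaled = pearson_r([x / 100 for x in income], food)

print(f"income and food      {r_income_food:+.3f}")
print(f"income and sundries  {r_income_sundries:+.3f}")
print(f"food and sundries    {r_food_sundries:+.3f}")
```

Since the coefficient is unaffected by a change of unit, the division by constants serves convenience of computation only.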

In making a summary comment on Engel's law, I would 
like to stress the following points from a purely methodological 
viewpoint. There seems to be definite evidence that in a given 
place, at a given time, the law holds consistently within certain 
limits. For very low income groups some other law may hold, 
or no law at all, and as to how extremely large incomes are spent 
we do not know. From the dynamic aspect, the law appears to 
have been working from the time of the French revolution up to 
the beginning of the present depression (much evidence could be 
cited to support this fairly well known fact, e.g. works of 
Schmoller, Rogers, D'Avenel, U. S. B. L. S. Bulletins, etc.). However, 
the study of W. A. Berridge (The need for a new survey of 
family budgets and buying habits, N. Y. Times, May 10, 1931, and 
“The Annalist,” July 17, 1931) seems to indicate that from the 
secular standpoint this law is not immediately reversible, for with 
the shrinking incomes we observe a definite decline in the outlays 


⁸ J. Educ. Research, I, March-June, 1920. 





for all items, including food, except for the outlay for sundries 
which appears to be almost stationary. 

That percentages of outlay for clothing, rent, and fuel and 
light can be added up and treated as a constant factor both 
statically and dynamically with rising incomes we can be reasonably 
certain of; what will happen with decreasing incomes in time 
analysis we are not ready to say. However, it must be borne in 
mind that even with the rising incomes the relationship between 
the percentages of outlay for food and for sundries need not be 
perfect as one may be led to think from their high individual 
coefficients of correlation with income in the example given above. 

As to a practical application of the budgetary analysis to 
forecasting, I shall venture to say that in a socially planned society 
(if such society is workable), the study of itemized expenditures 
may prove invaluable. In other societies it may be used to forecast 
some sort of consumption indices; if these will be successfully 
computed they will undoubtedly help to flatten the curve of 
business cycles to an appreciable degree. As to how to develop 
these indices, I have no suggestion to make just now, except that 
it must be on basis of extension of a crude analysis similar to 
one offered here, and application of probability technique, properly 
based on psychological and historical findings. All I hope to 
have made clear in this paper is that the subject is very difficult, 
and that an analysis offered here is sufficient as a first step. 

In conclusion, I must stress my indebtedness to Professors 
J. D. Black, J. A. Schumpeter, and C. C. Zimmerman for advice 
and suggestions. I am also grateful to Professor Zimmerman for 
the materials he let me examine. But above all I am indebted 
to Professor W. L. Crum from whom my point of view and 
method of attack are wholly derived; anything of value that I 
may have said in this paper is due to him. 




ON THE RELATIVE STABILITY OF THE MEDIAN 
AND ARITHMETIC MEAN, WITH PARTICULAR 
REFERENCE TO CERTAIN FREQUENCY DISTRIBU- 
TIONS WHICH CAN BE DISSECTED INTO 
NORMAL DISTRIBUTIONS¹ 

By 

Harry S. Pollard 

I. 

The Choice of an Average 

In any statistical investigation in which an average is to be 
used as a summarizing figure for a frequency distribution the 
question arises, which average best describes the distribution. 
That this is still a debatable question among writers on economic 
statistics is shown by a perusal of the many papers dealing with 
the measurement of seasonal variations which have appeared in 
recent years.² 

Each of the proposed methods of isolating seasonal variations 
involves an averaging, either of monthly items or of relatives of 
monthly items, but whether this averaging is best accomplished 
by use of the arithmetic mean, the median, or the mean of a 
middle group of items seems to be a moot point. Persons³ employs 
the median of link relatives of monthly items, since by this device 
the influence of large non-seasonal variations may be greatly 
moderated. Hart,⁴ in justifying the use of the arithmetic mean, 
has shown that the method of monthly means gives the actual 

¹ A résumé of a dissertation, bearing the same title, written under 
the direction of Professor Mark H. Ingraham and submitted in partial 
fulfillment of the requirements for the degree of Doctor of Philosophy in 
the University of Wisconsin, 1933. 

² For a bibliography of literature on this subject see Mills, F. C., 
Statistical Methods, p. 343. 

³ Persons, W. M., Correlation of Time Series, Jour. Amer. Statis. Ass'n., 
June, 1923, p. 717. 

⁴ Hart, W. L., The Method of Monthly Means for Determination of a 
Seasonal Variation, Jour. Amer. Statis. Ass'n., Sept., 1922, pp. 341-349. 





monthly values of the seasonal variation in case the seasonal 
variation is strictly periodic throughout the period of years under 
consideration and the long term variations are also periodic 
with integral numbers of years as their periods. The proof of 
this theorem is based on a property of Fourier series discussed 
by Bôcher.⁵ 

The point of view of this paper is that another factor of 
importance should influence the choice of an average; that this 
choice should be guided not alone by consideration of exceptional 
cases which may arise, nor by theory which assumes a periodicity 
seldom found in sequences of economic data, but also by a 
consideration of the stability of the averages. For if a given 
frequency distribution is regarded as a random sample drawn from 
a theoretical distribution which contains a very large number of 
items, the accuracy with which a particular average of the sample 
will typify the entire theoretical distribution is influenced by the 
frequency curve for that average. It is the purpose of this paper 
to compare the stability of the arithmetic means and medians of 
frequency distributions which may be dissected into two and 
three normal distributions, and to develop a general method of 
comparing the relative stability of the mean and median which 
shall be applicable to any frequency distribution. 

The dissection of a frequency curve into two normal 
components has been discussed by Karl Pearson,⁶ who has developed 
methods for determining the values of the parameters of both 
symmetrical and asymmetrical frequency functions. He has applied 
these methods to distributions of cranial weights. Crum⁷ has used 
Pearson's method of dissecting a symmetrical distribution in his 
discussion of the relative stability of the median and mean of link 

⁵ Annals of Mathematics, Second Series, vol. 7, p. 135, Formula (63). 

⁶ Pearson, K., Contributions to the Mathematical Theory of Evolution, 
Philosophical Transactions, Series A, vol. 185, 1894, pp. 71-110. 

⁷ Crum, W. L., The Use of the Median in Determining Seasonal 
Variation, Jour. Amer. Statis. Ass'n., March, 1923, pp. 607-614. 





relatives of monthly figures for the rate of interest on sixty to 
ninety-day commercial paper for the years 1890-1917, and his results 
are discussed in section VI of this paper. Our interest in an 
asymmetrical distribution composed of two normal distributions arises 
from the fact that such a distribution affords a good fit both to 
distributions which possess two distinct modes, and to skewed 
distributions with one mode. The study of a distribution which may 
be dissected into three normal components is suggested by the 
occurrence in economic data of tri-modal distributions. This paper will 
be concerned with only a particular class of three-component 
distributions, those which are symmetrical. 

The hypothesis from which this investigation started was that 
a good criterion for measuring the stability of an average is its 
standard deviation. However, a difficulty which soon presented 
itself was the accurate determination of the standard deviation of 
the median. The classical formula for expressing the standard 
deviation, $\sigma_M$, of the medians of samples of $s$ items each, drawn 
from a frequency distribution whose equation is $y = f(x)$ and 
which satisfies the condition 

$$\int_{-\infty}^{0} f(x)\,dx = \frac{1}{2} = \int_{0}^{\infty} f(x)\,dx,$$

is 

$$\sigma_M = \frac{1}{2 f(0)\sqrt{s}}.$$

The approximation to the value of the standard deviation of the 
median given by this formula is discussed in section IV, where it 
is shown that, although this approximation is close to the true 
value of the standard deviation of the median when $s$ is large, it 
may be a very poor approximation when $s$ is small, particularly 
for certain types of frequency curves. 
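
As a rough numerical illustration of the classical formula, the following sketch evaluates $1/(2 f(0)\sqrt{s})$ for a unit normal curve. The function name is introduced here for illustration only, and the unit normal curve is an assumed example rather than a case singled out at this point of the paper.

```python
from math import sqrt
from statistics import NormalDist


def classical_sd_median(f_at_median, s):
    """Classical large-sample approximation 1 / (2 f(0) sqrt(s))."""
    return 1.0 / (2.0 * f_at_median * sqrt(s))


# Ordinate of the unit normal curve at its median, x = 0 (illustrative choice).
f0 = NormalDist(0.0, 1.0).pdf(0.0)

for s in (7, 51):
    print(s, round(classical_sd_median(f0, s), 4))
```

The approximation shrinks as $1/\sqrt{s}$, so any error in it matters most for small samples.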

Since it became obvious that the relative stability of the 
medians and arithmetic means of small samples cannot be determined by 
the methods which are valid when the samples are large, this paper 
resolved itself into two distinct investigations: a treatment of 
certain frequency functions using the classical formula for the 
standard deviation of the median, valid for large values of $s$, 
and the development of a second method of comparing the stability 
of the arithmetic mean and median which may be applied also when 
$s$ is small. The first of these topics is considered in sections II 
and III, the second is taken up in sections IV and V, and is 
mathematically the more interesting part of the work. In section VI 
the various methods of comparing the stability of the arithmetic 
mean and median are applied to a particular sequence of economic 
data. 


II. 


The Relative Magnitude and Stability of the Arithmetic 
Mean of a Frequency Distribution Which Is Composed 
of Two Normal Distributions. 


1. The Mean and Median and Their Standard Deviations. 

In this section a study will be made of the frequency function 
whose equation is 

(1)    $$y = \frac{c_1}{\sigma_1\sqrt{2\pi}}\, e^{-x^2/2\sigma_1^2} + \frac{c_2}{\sigma_2\sqrt{2\pi}}\, e^{-(x-b)^2/2\sigma_2^2},$$

with the purpose of determining the influence of the five parameters 
of this equation upon the location of the mean and median of the 
distribution, and upon the standard deviations of these averages. 

The only conditions imposed upon the parameters $c_1$, $c_2$, $\sigma_1$, 
$\sigma_2$, $b$ are that they shall assume only positive values (since they 
represent, respectively, the areas of the two component curves, their 
standard deviations, and the distance between their arithmetic 
means), and that the first two parameters shall satisfy the equation 

(2)    $$c_1 + c_2 = 1,$$

so that the total probability, as represented by $\int_{-\infty}^{\infty} y\,dx$, shall 
be unity. 

The arithmetic mean, $\bar{x}$, of the distribution may be expressed 
as a function of the parameters by the equation 

(3)    $$\bar{x} = c_2 b.$$





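The median, $M$, of the distribution satisfies the equation 

(4)    $$\int_{-\infty}^{M}\left[\frac{c_1}{\sigma_1\sqrt{2\pi}}\, e^{-x^2/2\sigma_1^2} + \frac{c_2}{\sigma_2\sqrt{2\pi}}\, e^{-(x-b)^2/2\sigma_2^2}\right]dx = \frac{1}{2},$$

and can in general be located only by interpolation in a table of 
areas under the normal curve. This interpolation can be more easily 
performed if equation (4) is transformed into 

(5)    $$c_1\int_{0}^{M/\sigma_1} e^{-t^2/2}\,dt = c_2\int_{0}^{(b-M)/\sigma_2} e^{-t^2/2}\,dt.$$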


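In distribution (1), $\sigma_1$ and $\sigma_2$ denote the standard deviations 
of the component distributions, measured from the means of the 
respective components. Hence, the standard deviation, $\sigma$, of the 
entire distribution satisfies the equation 

$$\sigma = \sqrt{c_1\sigma_1^2 + c_2\sigma_2^2 + c_1 c_2 b^2}.$$

Therefore the value of the standard deviation of the arithmetic 
means of samples containing $s$ items each drawn from distribution 
(1) is 

(6)    $$\sigma_{\bar{x}} = \sqrt{\frac{c_1\sigma_1^2 + c_2\sigma_2^2 + c_1 c_2 b^2}{s}}.$$

If we assume that $s$, the number of items in the sample, is 
sufficiently large to justify its use, an approximation to the standard 
deviation of the median may be obtained from the equation 

(7)    $$\sigma_M = \frac{1}{2\, y_M \sqrt{s}},$$

where 

$$y_M = \frac{c_1}{\sigma_1\sqrt{2\pi}}\, e^{-M^2/2\sigma_1^2} + \frac{c_2}{\sigma_2\sqrt{2\pi}}\, e^{-(M-b)^2/2\sigma_2^2}.$$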
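
The quantities of this subsection can be sketched numerically as follows. This is only an illustration under stated assumptions: the first component is taken to be centred at the origin and the second at a distance `b` to its right, the median is found by bisection on the cumulative area instead of by interpolation in a table, and all function and variable names are introduced here rather than taken from the paper.

```python
from math import sqrt
from statistics import NormalDist


def two_component_summary(c1, sigma1, sigma2, b, s):
    """Mean, median and their standard deviations for a two-component curve.

    Assumes component 1 is centred at 0 and component 2 at b >= 0,
    with areas c1 and c2 = 1 - c1.
    """
    c2 = 1.0 - c1
    first, second = NormalDist(0.0, sigma1), NormalDist(b, sigma2)

    def cdf(x):
        return c1 * first.cdf(x) + c2 * second.cdf(x)

    def pdf(x):
        return c1 * first.pdf(x) + c2 * second.pdf(x)

    mean = c2 * b

    # Locate the median by bisection on the cumulative area.
    lo = -10.0 * max(sigma1, sigma2)
    hi = b + 10.0 * max(sigma1, sigma2)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    median = 0.5 * (lo + hi)

    # Standard deviation of the mean of s items.
    sd_mean = sqrt((c1 * sigma1**2 + c2 * sigma2**2 + c1 * c2 * b**2) / s)
    # Classical large-sample approximation for the median.
    sd_median = 1.0 / (2.0 * pdf(median) * sqrt(s))
    return mean, median, sd_mean, sd_median


print(two_component_summary(0.5, 1.0, 2.0, 3.0, 100))
```

Because the median standard deviation here rests on the classical approximation, the comparison is meaningful only for large `s`.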


2. The Relative Magnitude of the Median and Mean. 

From equations (3) and (5) it is seen that, if four of the five 
parameters, $c_1$, $c_2$, $\sigma_1$, $\sigma_2$, $b$, are fixed and the fifth is allowed 
to vary, both $\bar{x}$ and $M$ will be monotone increasing functions of 
$b$ and of $c_2$, and monotone decreasing functions of $c_1$, and that 
$\bar{x}$ is independent of the standard deviations of both components, 
while $M$ is a monotone increasing function of $\sigma_1$ and a monotone 
decreasing function of $\sigma_2$. 





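When $c_1 = c_2 = \tfrac{1}{2}$ and $\sigma_1 = \sigma_2$, 
distribution (1) becomes symmetrical, and 

$$\bar{x} = M = \frac{b}{2}.$$

To obtain conditions under which $\bar{x}$ shall exceed $M$, let equations 
(3) and (5) be differentiated with respect to $b$. The inequality 

$$\frac{d\bar{x}}{db} > \frac{dM}{db}$$

may be reduced to the form 

$$\sigma_2\, e^{-M^2/2\sigma_1^2} > \sigma_1\, e^{-(b-M)^2/2\sigma_2^2}.$$

It follows from equation (5) that when $c_1 > c_2$, 

$$\frac{M}{\sigma_1} < \frac{b-M}{\sigma_2}, \quad\text{whence}\quad e^{-M^2/2\sigma_1^2} > e^{-(b-M)^2/2\sigma_2^2}.$$

Hence the inequalities 

(8)    $$c_1 > c_2, \qquad \sigma_2 > \sigma_1$$

are a sufficient condition that $d\bar{x}/db$ shall exceed $dM/db$, and since 
$\bar{x} - M = 0$ when $b = 0$, inequalities (8) are sufficient to insure 
that for positive values of $b$, $\bar{x}$ will exceed $M$. 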

In the case of many frequency distributions whose form 
suggests dissection into two normal components it is found that the 
standard deviation of the smaller component exceeds that of the 
larger component. Hence, condition (8) is fulfilled, and $\bar{x}$ differs 
more from the mean of the larger component than does $M$. 


3. Relative Stability of Median and Mean for the Special Case, $b = 0$. 


From equations (6) and (7) it is seen that while $\sigma_{\bar{x}}$ and $\sigma_M$ 
are both monotone increasing functions of $b$, they do not possess 
a monotone character with respect to the other parameters of 
equation (1). The development of general conditions which the 
parameters must satisfy in order that $\sigma_{\bar{x}}$ may exceed $\sigma_M$ is impeded by 
the fact that $M$ is defined in (5) by an equation containing 
integrals, and its numerical value, for given values of the parameters, 
can be obtained only by interpolation in a table of areas under the 
normal curve. We shall therefore determine the relative stability 
of the median and arithmetic mean, as measured by the standard 
deviations of these averages, for certain special cases of 
distribution (1). 

If, in equation (1), $b$ is assigned the value zero, the 
distribution becomes symmetrical and $\bar{x} = M = 0$. Hence the condition 
for equal stability of median and arithmetic mean, $\sigma_{\bar{x}} = \sigma_M$, may 
in this special case be written 

$$c_1\sigma_1^2 + c_2\sigma_2^2 = \frac{\pi}{2\left(\dfrac{c_1}{\sigma_1} + \dfrac{c_2}{\sigma_2}\right)^2}.$$

Letting the ratio, $\sigma_2/\sigma_1$, be denoted by $\rho$, we obtain 

$$f(\rho) \equiv 2\,(c_1 + c_2\rho^2)(c_1\rho + c_2)^2 - \pi\rho^2 = 0.$$

This fourth degree equation in $\rho$ possesses two positive real roots, 
independent of $c_1$ and $c_2$, for $f(0)$ and $f(\infty)$ are both positive, 
while $f(1) = 2 - \pi < 0$. 

Hence there exist two values, $\rho' < 1$ and $\rho'' > 1$, such that when $\rho$ 
assumes either of these values the standard deviations of the 
arithmetic mean and median are equal. For values of $\rho$ in the interval 
$(\rho' < \rho < \rho'')$ the standard deviation of the arithmetic mean is less 
than that of the median. For values of $\rho$ outside this interval, 
the standard deviation of the arithmetic mean is greater than that 
of the median. Hence it is seen that, for $b = 0$, the relative stability 
of the median and arithmetic mean of distribution (1) is determined 
by the ratio of the standard deviations of the two component curves. 

Yule⁸ has discussed the relative stability of the median and 
arithmetic mean of distribution (1) when, in addition to the 
condition $b = 0$, the distribution is subjected to the further restriction 
$c_1 = c_2$, 


⁸ Yule, G. U., An Introduction to the Theory of Statistics, 8th ed., 
p. 339. 





and has obtained the numerical values of p for which the two 
averages will possess equal standard deviations: 


= Z,Z3^0 . 
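
For the case in which the two components share a common mean, the two ratios of component standard deviations at which the mean and median are equally stable can be located numerically. The sketch below is an illustration only: it equates the variance of the mean with the squared classical approximation for the median, writes the result as a quartic in `rho` (the ratio of the second standard deviation to the first, an assumed convention), and finds its two positive roots by bisection. The function names are hypothetical.

```python
from math import pi


def f(rho, c1):
    """Equal-stability condition for coincident component means (sketch).

    Positive values mean the mean is less stable than the median.
    """
    c2 = 1.0 - c1
    return 2.0 * (c1 + c2 * rho**2) * (c1 * rho + c2) ** 2 - pi * rho**2


def bisect(func, lo, hi, steps=200):
    """Root of func between lo and hi, assuming a sign change on the interval."""
    f_lo_positive = func(lo) > 0
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if (func(mid) > 0) == f_lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def equal_stability_ratios(c1):
    """Roots below and above unity; requires 0 < c1 < 1."""
    g = lambda rho: f(rho, c1)
    return bisect(g, 1e-9, 1.0), bisect(g, 1.0, 1e6)


print(equal_stability_ratios(0.5))
```

With equal areas the two roots are reciprocals of one another, which gives a convenient check on the computation.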

4. Relative Stability of Median and Mean for the Special Case, $c_1 = c_2$. 

Now let the restriction $b = 0$ be removed. Let $b$ assume 
any positive value, and let the condition $c_1 = c_2 = 0.5$ 
be imposed. The upper limits of the integrals in equation (5) will 
then be equal, whence 

$$\frac{M}{\sigma_1} = \frac{b - M}{\sigma_2}, \qquad M = \frac{b\,\sigma_1}{\sigma_1 + \sigma_2}, \qquad \bar{x} = \frac{b}{2}.$$

The relative magnitude of the median and mean is seen to 
depend upon the standard deviations of the component distributions, 
and $\bar{x}$ is greater than, equal to, or less than $M$ according as 
$\rho$ is greater than, equal to, or less than unity. 

To obtain conditions for equal stability in the two averages, 
let $\sigma_{\bar{x}}$ be set equal to $\sigma_M$. By introducing the notation 

$$\rho = \frac{\sigma_2}{\sigma_1}, \qquad k = \frac{b}{\sigma_1 + \sigma_2},$$

this equation may be reduced to the form 

(10)    $$(1+\rho)^2\left[\,2(1+\rho^2) + k^2(1+\rho)^2\right] = 8\pi\rho^2 e^{k^2}.$$

Taking $\lambda = (\rho + 1/\rho)$ as a new variable, this equation may be 
written as the quadratic 

$$(k^2+2)\lambda^2 + 4(k^2+1)\lambda + 4k^2 - 8\pi e^{k^2} = 0,$$

whose roots are both real for all values of $k$. Furthermore, since 
$\pi e^{k^2} > 2(k^2+1)$ for all values of $k$, one of these roots is positive 
and greater than 2, and therefore has a value which $(\rho + 1/\rho)$ 
may assume. 

Hence, for all values of $k$ (and therefore for all values of 
$b$) there exist two reciprocal values of $\rho$, ($\rho'$ and $\rho''$), such 
that when $\rho$ assumes either of these values the standard deviations 
of the arithmetic mean and median are equal. For values of $\rho$ in 
the interval $(\rho' < \rho < \rho'')$ the standard deviation of the arithmetic 
mean is less than that of the median, and for values of $\rho$ not in 
this interval, the standard deviation of the arithmetic mean is 
greater than that of the median. 

Yule's results show that when p = andyg=a,234o^ 

and therefore that the mean and median are equally stable when 
the standard deviation of one component is approximately 2.25 
times that of the other. It remains to investigate the behavior of 
the interval $(\rho' < \rho < \rho'')$ as $b$ varies. 

Since it is the ratio of the standard deviations of the component 
curves, and not their actual values, which determines this interval, 
suppose the unit of measurement to be so chosen that $\sigma_1 + \sigma_2 = 1$, 
whence $b = k$, and 

$$\lambda = \frac{-2(k^2+1) + 2\sqrt{2(k^2+2)\pi e^{k^2} + 1}}{k^2+2}.$$

Then since $\pi e^{k^2} > 2$ for all values of $k$, $\lambda$ may be shown 
to be a monotone increasing function of $k^2$ and therefore a 
monotone increasing function of $b$ (positive). Since, furthermore, $\lambda$ 
is an increasing function of $\rho$ (when $\rho > 1$), it appears that the 
interval $(\rho' < \rho < \rho'')$ in which the standard deviation of the 
arithmetic mean is less than that of the median (i.e., the interval in 
which the mean is the more stable average), becomes larger as 
the size of $b$ is increased. 
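
The widening of the interval can be sketched numerically for the equal-area case. The code below is an illustration under assumptions made here: `k` stands for the distance between the component means divided by the sum of the component standard deviations, the quadratic solved is the one obtained by equating the variance of the mean with the squared classical approximation for the median, and the function name is hypothetical.

```python
from math import exp, pi, sqrt


def equal_stability_interval(k):
    """End points (rho', rho'') for equal areas and k = b / (sigma1 + sigma2)."""
    # Positive root of (k^2+2) L^2 + 4(k^2+1) L + 4k^2 - 8 pi e^{k^2} = 0,
    # where L stands for rho + 1/rho.
    lam = (-2.0 * (k**2 + 1.0)
           + 2.0 * sqrt(1.0 + 2.0 * (k**2 + 2.0) * pi * exp(k**2))) / (k**2 + 2.0)
    # Recover the root of rho + 1/rho = lam that exceeds unity.
    upper = 0.5 * (lam + sqrt(lam**2 - 4.0))
    return 1.0 / upper, upper


for k in (0.0, 0.5, 1.0):
    print(k, equal_stability_interval(k))
```

The two end points are reciprocals by construction, so only the upper one needs to be tabulated against `k`.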

Summarizing, for the special case of distribution (1), in which 
$c_1 = c_2$ (in which the areas of the two component normal 
curves are equal), the relative stability of the median and mean 
depends upon the value of $\rho$, the ratio of the standard deviations 
of the two component curves, and upon the value of $b$, the distance 
between their means. When $\rho$ equals one, the mean is the more 
stable average, independent of $b$. Furthermore, for all positive 
values of $b$ there exists an interval of values of $\rho$, including 
$\rho = 1$, within which the mean is more stable, at the end points of 
which the averages are equally stable, and without which the median 
is more stable. , When, “^-o,'this 'interval is 
,and as'^ .increases the, interval'; becomes "larger. , , 





It was stated at the beginning of this section that, on account 
of the approximation to the value of $\sigma_M$ which is used, the 
conclusions will apply to distributions containing a large number of items. 
It should be noted that, in the special case which has just been 
considered, to assign a large value to $b$ may cause the median 
to fall at a point of relatively small frequency, in which case, as 
will be shown in section IV, the approximation to the standard 
deviation of the median, $1/(2\,y_M\sqrt{s})$, will exceed its true 
value, and the superior stability of the arithmetic mean, as obtained 
from equation (10), may be exaggerated. In such cases the probable 
errors of the averages should be computed by the method of 
section V. 

5. Relative Stability of Median and Mean for the General 
Distribution (1). 

Finally, let the restriction $c_1 = c_2$ be removed from 
distribution (1). The condition that the standard deviations of the median 
and mean of this general distribution shall be equal may be written 

$$c_1\sigma_1^2 + c_2\sigma_2^2 + c_1 c_2 b^2 = \frac{\pi}{2\left(\dfrac{c_1}{\sigma_1}\, e^{-M^2/2\sigma_1^2} + \dfrac{c_2}{\sigma_2}\, e^{-(b-M)^2/2\sigma_2^2}\right)^2}.$$

Let the notation 

$$\rho = \frac{\sigma_2}{\sigma_1}, \qquad k = \frac{b}{\sigma_1}, \qquad l = e^{-M^2/2\sigma_1^2}, \qquad m = e^{-(b-M)^2/2\sigma_2^2}$$

be introduced. The parameters $\rho$, $k$, $l$, $m$ may thus assume only 
positive values, and $l$ and $m$ are not greater than unity. Then 

$$f(\rho) \equiv 2\,(c_1 + c_2\rho^2 + c_1 c_2 k^2)(c_1 l \rho + c_2 m)^2 - \pi\rho^2 = 0.$$

This equation may possess two real positive roots. As $\rho$ 
approaches zero or positive infinity, it is seen that $f(\rho)$ becomes 
positive, independent of $l$ and $m$, and therefore that the standard 
deviation of the mean exceeds that of the median. If the equation 
possesses two distinct positive roots, there will be an interval of 
positive values of $\rho$ for which the standard deviation of the median 
will exceed that of the mean. However, this interval does not 
necessarily contain the value, $\rho = 1$, as in the special case where 
$c_1 = c_2$. 




III. 

The Relative Stability of the Median and Arithmetic 
Mean of a Symmetrical Frequency Distribution Which 
Is Composed of Three Normal Distributions. 

1. The Relative Magnitude of the Standard Deviations of the 
Median and Mean. 

It will be the purpose of this section to investigate the relative 
stability of the median and arithmetic mean of a symmetrical 
frequency distribution which may be dissected into three normal 
distributions, two of which possess equal areas and equal standard 
deviations and whose means are translated equal distances to 
left and right, respectively, of the mean of the third distribution. 

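The equation of the frequency function which describes such 
a distribution is of the form 

(1)    $$y = \frac{c_1}{\sigma_1\sqrt{2\pi}}\, e^{-x^2/2\sigma_1^2} + \frac{c_2}{\sigma_2\sqrt{2\pi}}\left[e^{-(x-b)^2/2\sigma_2^2} + e^{-(x+b)^2/2\sigma_2^2}\right],$$

where the areas of the components are connected by the relation 
$c_1 + 2c_2 = 1$. 

Since the distribution is symmetrical with respect to the $y$-axis, 
both the median and mean fall at the origin, and, if the 
approximation, $\sigma_M = 1/(2\,y_0\sqrt{s})$, is used, the standard deviations 
of the median and mean are readily expressed in terms of the 
parameters of equation (1): 

$$\sigma_{\bar{x}} = \sqrt{\frac{c_1\sigma_1^2 + 2c_2(\sigma_2^2 + b^2)}{s}}, \qquad \sigma_M = \frac{\sqrt{2\pi}}{2\sqrt{s}\left(\dfrac{c_1}{\sigma_1} + \dfrac{2c_2}{\sigma_2}\, e^{-b^2/2\sigma_2^2}\right)}.$$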
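
The two standard deviations for the symmetrical three-component curve can be evaluated directly. The sketch below is illustrative only: it assumes a central component of area `1 - 2*c2` and standard deviation `sigma1`, flanked by two components of area `c2` and standard deviation `sigma2` centred at `-b` and `+b`, and it uses the classical approximation for the median. The function name and the trial values are assumptions made here.

```python
from math import exp, pi, sqrt


def three_component_sds(c2, sigma1, sigma2, b, s):
    """Standard deviations of the mean and (classical formula) median."""
    c1 = 1.0 - 2.0 * c2
    # Ordinate of the curve at the origin, where the median falls.
    y0 = (c1 / sigma1
          + 2.0 * c2 / sigma2 * exp(-b**2 / (2.0 * sigma2**2))) / sqrt(2.0 * pi)
    sd_mean = sqrt((c1 * sigma1**2 + 2.0 * c2 * (sigma2**2 + b**2)) / s)
    sd_median = 1.0 / (2.0 * y0 * sqrt(s))
    return sd_mean, sd_median


# Scan the outer area c2 to see where the mean ceases to be the more stable.
for c2 in (0.0, 0.1, 0.2, 0.3, 0.4):
    sd_mean, sd_median = three_component_sds(c2, 1.0, 1.0, 3.0, 100)
    print(c2, sd_mean, sd_median, sd_mean < sd_median)
```

Setting `c2 = 0` recovers the single normal curve, for which the median is less stable than the mean by the familiar factor $\sqrt{\pi/2}$.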




To obtain conditions under which the. two averages will be 
equally stable, let the notation, 

<rr 




7t 





be employed and let the standard deviations of median and mean 
be set equal to each other, whence we obtain the equation 

2 /> -f- i Cj, ("e" 




whicbj if we let 


mav be written 


(€. o,-y?cL Zy>Y/'^770-/J = 


Since positive real roots of equation 

(2) which are less than 0*5 are of interest. Independent of the 
positive value assigned to and n. , this equation may have not 
more than two real roots in this interval, for bothJ^(O) and 
/ ( 0 . 5 ) are less than zero when o . 

If equation (2) possesses two real, distinct, positive roots 
less than 0.5, then there will be a subinterval of the interval 
$(0 < c_2 < 0.5)$ within which the standard deviation of the 
arithmetic mean will exceed the standard deviation of the median, 
at the end points of which the averages will be equally stable, 
and without which the standard deviation of the median will 
exceed that of the arithmetic mean. If the equation has no real 
roots in this interval, the standard deviation of the median will 
exceed that of the arithmetic mean throughout the interval. 

The tangents to the curve whose equation is (2) are hori- 
zontal when and when xc ^ . Since 

is negative for all values of -/^ other than zero, 
is a value of for which the standard deviation of the median 
is greater than the standard deviation of the arithmetic mean. 
If there 'exists an interval of values of for which the standard 
deviation of the arithmetic mean ' is greater than the standard 
deviation of the" median, it will contain the value 
'Therefore the condition that such an interval exist ■ is 





If the valiiCj lies in the interval (o{c^{o,3) 

and if f = o , then equation (2) will possess a 

double root, and the standard deviation of the median will equal 
the standard deviation of the arithmetic mean for a single value 
of $c_2$, and will exceed it for all other values 

of . The condition under which will vanish 

is that ^/pfu assume one of the values: 4.67284, — 1.53327, 
— 0.13957, for letting A » equation (2) becomes 

(^) - o ^ 

which may be written as a cubic equation in whose roots 

have the above values. 

2. The Dissection of a Frequency Distribution into Three 
Normal Components. 

In order to apply the above conclusions in determining the 
relative stability of the averages of a particular sequence of 
economic data, it is necessary that the data be dissected into three 
normal distributions. A general method of determining the values 
of the five parameters of equation (1) from given frequency 
data will therefore be developed. 

Karl Pearson⁹ has described a method for dissecting an 
asymmetrical frequency curve into two normal curves. He obtains 
expressions for the first five moments of the curve, which he 
solves, after lengthy algebraic manipulation, for the parameters. 
A similar procedure, the solution of moment equations, may be 
applied to a dissection into three normal curves. However, since 
the distribution has been assumed to be symmetrical, expressions 
for the odd moments vanish identically. Hence it is necessary 
to use moments as high as the eighth in order to obtain five 
equations from which the values of the parameters may be determined. 
While Pearson's method of setting up the moment equations may 
be used, his method of solution will not carry over to this case. 

⁹ Loc. cit. 





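Given a frequency distribution of the variable $x$ whose origin 
has been chosen at the arithmetic mean of the distribution, let 
$\mu_n$ denote the $n$th moment of the distribution, and let $\mu_n$ be 
set equal to the corresponding moment of the theoretical 
distribution whose equation is (1). We have, then, as the equations 
from which the five parameters of distribution (1) may be 
determined: 

(4)    $$c_1 + 2c_2 = 1,$$

$$c_1\sigma_1^2 + 2c_2(\sigma_2^2 + b^2) = \mu_2,$$

$$3c_1\sigma_1^4 + 2c_2(3\sigma_2^4 + 6\sigma_2^2 b^2 + b^4) = \mu_4,$$

$$15c_1\sigma_1^6 + 2c_2(15\sigma_2^6 + 45\sigma_2^4 b^2 + 15\sigma_2^2 b^4 + b^6) = \mu_6,$$

$$105c_1\sigma_1^8 + 2c_2(105\sigma_2^8 + 420\sigma_2^6 b^2 + 210\sigma_2^4 b^4 + 28\sigma_2^2 b^6 + b^8) = \mu_8.$$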
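
The even moments of a symmetrical three-component curve follow from the standard moments of a normal curve about a displaced origin, and they can be checked against direct numerical integration. The sketch below is an illustration with assumed names and trial values; the midpoint-rule integrator is a convenience introduced here and is not part of the method of the paper.

```python
from statistics import NormalDist


def even_moments(c1, c2, sigma1, sigma2, b):
    """Moments of orders 0, 2, 4, 6, 8 about the origin (closed forms)."""
    s1, s2, b2 = sigma1**2, sigma2**2, b**2
    return [
        c1 + 2 * c2,
        c1 * s1 + 2 * c2 * (s2 + b2),
        3 * c1 * s1**2 + 2 * c2 * (3 * s2**2 + 6 * s2 * b2 + b2**2),
        15 * c1 * s1**3
        + 2 * c2 * (15 * s2**3 + 45 * s2**2 * b2 + 15 * s2 * b2**2 + b2**3),
        105 * c1 * s1**4
        + 2 * c2 * (105 * s2**4 + 420 * s2**3 * b2 + 210 * s2**2 * b2**2
                    + 28 * s2 * b2**3 + b2**4),
    ]


def numeric_moment(n, c1, c2, sigma1, sigma2, b, steps=20000):
    """n-th moment about the origin by the midpoint rule (numerical check)."""
    centre = NormalDist(0.0, sigma1)
    left, right = NormalDist(-b, sigma2), NormalDist(b, sigma2)
    half_width = b + 12.0 * max(sigma1, sigma2)
    h = 2.0 * half_width / steps
    total = 0.0
    for j in range(steps):
        x = -half_width + (j + 0.5) * h
        y = c1 * centre.pdf(x) + c2 * (left.pdf(x) + right.pdf(x))
        total += x**n * y
    return total * h


params = (0.4, 0.3, 1.0, 0.5, 2.0)  # illustrative values with c1 + 2*c2 = 1
print(even_moments(*params))
print([numeric_moment(n, *params) for n in (0, 2, 4, 6, 8)])
```

Agreement between the two printed lists gives some assurance that the five moment equations have been written down consistently before any attempt is made to solve them.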

Instead of carrying through the solution of these five 
equations, it has been found convenient to assign to $\sigma_1$ a value equal 
to the standard deviation of a central group of items, and to 
retain only the first four moment equations to be solved for the 
other four parameters. Later the five equations will be used to 
correct this estimated value of $\sigma_1$. 

If we let Ai^ denote the nf— moment of the given distribu- 
tion with as unit, denote by ul^ by /o , and elim- 

inate from these equations, we obtain 

O) -f- - Ml 

'ic.i ■f-C(-Xli)p^(3 + (,tL'+tC )- Hlf 

^ MI . 

Now eliminating C, , this system of equations reduces to 
"M O^MjJ f O-mDp (fst45icnsMfic) = mA/ 





and when the notation 

e ^ (^15 Ml- Mi) = 

is introduced and the equations are written in descending powers 
of they become 

C3t(^u^Laf)^/^ys Y ^ 0^ 

( 6 ) 

/o\ (/5 u}") -f € ^ O ^ 


Let Sylvester’s method of elimination be applied to these 
two equations, making use of the property that the resultant, 
, of the equations a.^x"i^c3L^x.\cL^x and 


IS 


/? = 




(a ^)” <2.^4 
-4 


“ ^34 


where (<x. 4,') denotes 

Let (ftii^) 6£.*J ^ )^£( u'j^ 65 f V5U f * 

'<r/, 

A y 

and expanding and simplifying this determinant we obtain 


R- 


=: O 


i - = <=> ■ 

Since , e are constants, this equation may be written 

HU,<- ii'-o- 

If, finally, j] , and are replaced by their values as functions 
of this equation reduces to 

(7) u!‘'(Ai-l3iC-i-Di-E-) + uJ^CzzA ■hH0 + /iC ■/■ 3cl>f n £) 

+ LLii59A+C.-7f3-i-‘^3C-t-3lBt>+-int) 

+ u} CH^^AS fl3XB+t9(i,C +! 3S^oDf BjiHt) 
i-uMCs55A^l-^3f3-i-t'i5CfX'l'l5D-f-3 5>^} 

-f U.’-CxiyaA 9 AUX.b) 

f iA 5 A B 9 >J i-fSC f AX5l> B ■zIb) = 


¹⁰ Dickson, L. E., First Course in Theory of Equations, p. 150. 





a sixth degree equation in tt^upon which the complete soltition 
of the problem now turns, for having obtained a value of 
from equation ( 7 ), the values of /o’" and may be determined 
from equations (6) and (5), respectively. Since , 

/? = ^67 » and f , the parameters of equation (1) 

may be obtained. 

It will be recalled that equation (7) has been obtained from 
the first four of equations (4), and that we have employed this 
equation to obtain values of $c_1$, $c_2$, $\sigma_2$, $b$ corresponding to an 
assigned value of $\sigma_1$. This estimated value of $\sigma_1$ may be 
corrected, and corresponding corrections to the values of the other 
four parameters may be obtained, by use of the five equations (4). 

Let equations ( 4 ) be written 

M ^ ~ C ^ ^ y V 

Let cfL denote the values which the four parameters 

take on when <7 is assigned the value OJ . Let ^7 
^7 ^ denote the respective corrections which should be ap- 
plied. Then using Taylor's theorem and neglecting terms which 
contain derivatives of higher order than the first, we obtain five 
linear equations in the five corrections: 


The corrected values of the parameters, , 

7 f /icr; , 7% A 7 , may be regarded as second approx- 

imations to their true^ values, and further approximations may 
be obtained in the same fashion. 


IV. 

The Standard Deviation of the Medians of Small Samples. 

1. The Classical Approximation to the Standard Deviation of 
the Median.¹¹ 

In the preceding sections an approximation to the standard 
deviation of the median has been used, and the conclusions have 
been assumed to be valid only when $s$, the number of items in 
the sample, is large. We wish, in the present section, to examine 
this approximation, and to compare the results which it produces 
with those obtained when other methods of determining the 
standard deviation of the median are employed. 

The formula ordinarily used to compute the standard 
deviation of the medians of samples of $s$ items each, drawn from a 
frequency distribution whose equation is $y = f(x)$ and which 
satisfies the condition 

(1)    $$\int_{-\infty}^{0} f(x)\,dx = \frac{1}{2} = \int_{0}^{\infty} f(x)\,dx,$$

is 

(2)    $$\sigma_M = \frac{1}{2 f(0)\sqrt{s}}.$$

That this formula gives only an approximation to the true 
value of the standard deviation of the median and that the 
approximation may be rather poor for distributions of certain types 
is clear from the following derivation of the formula. 

Let samples containing $s$ items each be drawn from the 
distribution $y = f(x)$ which satisfies condition (1). Let the 
proportion of items above $x = 0$ in each sample be denoted by 
$(0.5 + d)$. These observed values will tend to cluster around 
0.5 as a mean, with a standard deviation of $1/(2\sqrt{s})$. Let the 
deviation of the median of a sample from the median of the 
theoretical distribution, $x = 0$, be denoted by $e$. Then if the 
number of items in the sample is sufficiently large to justify us 
in assuming that $d$ is so small that we may regard the element 
of the frequency curve whose base is the interval $(0, e)$ and 
whose area is $d$, as approximately a rectangle, we may write 

$$d = e\, f(0),$$

whence 

$$\sigma_e = \frac{\sigma_d}{f(0)} = \frac{1}{2 f(0)\sqrt{s}}.$$


¹¹ Rietz, H. L., Mathematical Statistics, Carus Monograph III, p. 134. 
Yule, G. U., Introduction to the Theory of Statistics, 8th ed., p. 337. 





This replacement of an element of a frequency curve by a 
rectangle can be justified only when $e$, the deviation of the 
median of the sample from the median of the theoretical 
distribution, is small. Hence there is reason to doubt whether formula 
(2) will give a close approximation to the value of the standard 
deviation of the median of samples which do not contain a large 
number of items. The formula would seem particularly 
untrustworthy when applied to a theoretical distribution in which the 
median falls at a point of relatively small frequency. 

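An expression for the standard deviation of the median 
which is not liable to the inaccuracies of approximation (2) may 
be derived as follows. Given the frequency function $y = f(x)$ 
which satisfies condition (1), if a sample of $(2n+1)$ items is 
drawn from this distribution, the probability that an item will 
fall in the interval $(x, x+dx)$ approaches the limit $f(x)\,dx$ as $dx$ 
approaches zero, the probability that an item will fall below $x$ 
is $\int_{-\infty}^{x} f(x)\,dx$, and the probability that an item will fall above 
$x$ is $\int_{x}^{\infty} f(x)\,dx$. Hence the limit, as $dx$ approaches zero, of 
the probability that the median of the sample will fall in the 
interval $(x, x+dx)$ is 

$$\frac{(2n+1)!}{(n!)^2}\left[\int_{-\infty}^{x} f(x)\,dx\right]^n\left[\int_{x}^{\infty} f(x)\,dx\right]^n f(x)\,dx,$$

and the square of the standard deviation of the median may be 
obtained from the equation 

(3)    $$\sigma_M^2 = \frac{(2n+1)!}{(n!)^2}\int_{-\infty}^{\infty} x^2\left[\int_{-\infty}^{x} f(x)\,dx\right]^n\left[\int_{x}^{\infty} f(x)\,dx\right]^n f(x)\,dx.$$

The integrations involved in this equation may be difficult 
to perform unless $f(x)$ is a simple function. Hence, we consider 
the rectangular distribution whose equations are 

$$f(x) = 1, \; (-0.5 \le x \le 0.5); \qquad f(x) = 0 \text{ elsewhere},$$

and obtain 

$$\sigma_M^2 = \frac{(2n+1)!}{(n!)^2}\int_{-0.5}^{0.5} x^2\,(0.25 - x^2)^n\,dx = \frac{1}{4(2n+3)}.$$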





If we denote by $\sigma'_M$ the approximation to the value of the 
standard deviation of the median obtained using formula (2), 
we have for this distribution 

$$\sigma'_M = \frac{1}{2\sqrt{2n+1}},$$

whence we have the relation 

$$\sigma'_M = \sigma_M\sqrt{\frac{2n+3}{2n+1}}.$$

It is observed that the approximation, $\sigma'_M$, exceeds the true 
value, $\sigma_M$, for all values of $n$, but that the error factor approaches 
unity as $n$ increases, and is close to unity even for fairly small 
values of $n$. 
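
The comparison for the rectangular distribution can be checked numerically. The sketch below is illustrative only: it assumes samples of `2n + 1` items drawn from a rectangular curve of unit height on the interval from -0.5 to 0.5, evaluates the exact integral by the midpoint rule, and compares the result with the classical approximation. The function name and the chosen values of `n` are assumptions made here.

```python
from math import factorial, sqrt


def exact_var_median_rectangular(n, steps=20000):
    """Variance of the median of 2n+1 items from the rectangular curve.

    Evaluates (2n+1)!/(n!)^2 * integral of x^2 (0.25 - x^2)^n over
    (-0.5, 0.5) by the midpoint rule.
    """
    coeff = factorial(2 * n + 1) / factorial(n) ** 2
    h = 1.0 / steps
    total = 0.0
    for j in range(steps):
        x = -0.5 + (j + 0.5) * h
        total += x * x * (0.25 - x * x) ** n
    return coeff * total * h


for n in (1, 3, 25):
    exact = sqrt(exact_var_median_rectangular(n))
    approx = 1.0 / (2.0 * sqrt(2 * n + 1))  # classical formula with f(0) = 1
    print(2 * n + 1, exact, approx, approx / exact)
```

The last printed column is the error factor, which can be compared with $\sqrt{(2n+3)/(2n+1)}$.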


2. A General Method of Obtaining Upper and Lower Limits 
of the Standard Deviation of the Median. 

For distributions composed of two normal components the 
integrations involved in equation (3) can be performed only 
approximately, and this equation will serve only to determine upper 
and lower limits of the true value of the standard deviation of 
the median. A more straightforward method of obtaining these 
upper and lower limits, and one which is applicable to any 
frequency distribution, will be followed. 

Let $x_i$ denote the deviation of the $i$th percentile of a 
distribution from the median, and let $p_i$ denote the probability that 
the median of a sample of $s$ items will fall between the $i$th 
and $(i+1)$th percentiles of the distribution from which the 
samples are drawn. Then a lower limit of the standard deviation of 
the medians of samples containing $s$ items drawn from this 
distribution is given by the expression 

$$L = \sqrt{\sum_{i=0}^{49} p_i\, x_{i+1}^2 + \sum_{i=50}^{99} p_i\, x_i^2},$$

and an upper limit, by the expression 

$$U = \sqrt{\sum_{i=0}^{49} p_i\, x_i^2 + \sum_{i=50}^{99} p_i\, x_{i+1}^2},$$

where, in the case of a distribution in which the zeroth or 
hundredth percentile is at an infinite distance from the median, 
$x_0$ denotes the largest value of $x$ for which it is true that 
$\int_{-\infty}^{x} f(x)\,dx < \epsilon$, and $x_{100}$ denotes the smallest value of $x$ for 
which it is true that $\int_{x}^{\infty} f(x)\,dx < \epsilon$, where $\epsilon$ is an arbitrarily 
small positive constant. 
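
The percentile method can be sketched in code as follows. This is an illustration under several assumptions introduced here: the sample size `s` is odd, the probability that the median falls in each percentile band is computed exactly from the binomial law rather than from a normal approximation to it, the infinite extreme percentiles are replaced by the points cutting off a tail area `eps`, and within each band the squared deviation is bounded below and above by its values at the end points. All names are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist


def median_band_probs(s):
    """p[i] = P(sample median lies between percentiles i and i+1), s odd."""
    m = (s + 1) // 2

    def below(q):
        # The median lies below the q-quantile when at least m items do.
        return sum(comb(s, j) * q**j * (1 - q) ** (s - j) for j in range(m, s + 1))

    cum = [below(i / 100) for i in range(101)]
    return [cum[i + 1] - cum[i] for i in range(100)]


def sd_median_limits(quantile, s, eps=1e-9):
    """Lower and upper limits; quantile(q) is measured from the median."""
    qs = [eps] + [i / 100 for i in range(1, 100)] + [1 - eps]
    x = [quantile(q) for q in qs]
    p = median_band_probs(s)
    lower = sqrt(sum(p[i] * min(abs(x[i]), abs(x[i + 1])) ** 2 for i in range(100)))
    upper = sqrt(sum(p[i] * max(abs(x[i]), abs(x[i + 1])) ** 2 for i in range(100)))
    return lower, upper


# Illustrative use with the unit normal curve, whose median is at zero.
print(sd_median_limits(NormalDist(0.0, 1.0).inv_cdf, 7))
print(sd_median_limits(NormalDist(0.0, 1.0).inv_cdf, 51))
```

Because the bands are one percentile wide, the two limits bracket the true value closely only where the median is unlikely to fall in the widely spaced extreme bands.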

The values of $x_i$ depend on the distribution, and are 
independent of the number of items in the sample. The values of 
$p_i$ depend on the number of items in the sample and are 
independent of the form of the distribution. Approximations to 
the values of $p_i$, $(i = 0, 1, \ldots, 99)$, may be obtained by use 
of the DeMoivre-Laplace theorem.¹² In our notation the theorem 
may be stated: 

The probability that $m$ or more of the items of a sample 
containing $(2m-1)$ items will fall to the right of the $i$th 
percentile of the distribution from which the sample is drawn is 

TL e where H = - • 

<- J \fz7r sj (>oi L)Cf-.oi c) 

Then , 

Tables¹³ of values of $p_i$ for samples containing 7 and 51 
items have been computed, and have been used in calculating 
upper and lower limits of the standard deviation of the medians 
of samples containing 7 and 51 items drawn from the distributions 
whose equations are 

$$f_1(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2},$$

$$f_2(x) = \frac{1}{2\sqrt{2\pi}}\left(e^{-(x-2)^2/2} + e^{-(x+2)^2/2}\right),$$

4 f-) - 1^'^' ' ^ ‘ J. 

* Rietz, H. L., Mathematical Statistics, Carus Monograph III, p. 35.

† These tables are included in the author's dissertation, which is filed
in the library of the University of Wisconsin.


Approximations to the value of the standard deviation of the 
median have also been obtained using formula (2). The results 
are tabulated below. 


Standard Deviation of the Median

                        7 Items                       51 Items
                Upper    Lower   Formula      Upper    Lower   Formula
Distribution    Limit    Limit     (2)        Limit    Limit     (2)

$f_1(x)$        0.4715   0.4462   0.4737      0.1838   0.1631   0.1755
$f_2(x)$        1.4701   1.4114   3.5003      0.8551   0.7733   1.2968
$f_3(x)$        0.2683   0.2521   0.2368      0.0935   0.0831   0.0877

We conclude that, when applied to samples containing a 
fairly small number of items, the results obtained using the cus- 
tomary formula for the standard deviation of the median may 
be very untrustworthy, particularly for a distribution in which 
the median falls at a point of relatively small frequency.
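As a small check on the tabulated figures, the sketch below evaluates what is assumed here to be the customary formula, namely $1/(2 f(0)\sqrt{s})$ for a density $f$ of unit area with its median at the origin and samples of $s$ items. Formula (2) itself is not reproduced in this section, so both that form and the choice of a unit normal parent are assumptions; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def customary_sd_of_median(density_at_median, s):
    """Large-sample approximation 1 / (2 f(median) sqrt(s)), assumed form of formula (2)."""
    return 1.0 / (2.0 * density_at_median * sqrt(s))

f0 = NormalDist().pdf(0.0)              # ordinate of the unit normal at its median
sd7 = customary_sd_of_median(f0, 7)     # samples of 7 items
sd51 = customary_sd_of_median(f0, 51)   # samples of 51 items
```

Under these assumptions the two values round to 0.4737 and 0.1755.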

We therefore shall propose another method for the comparison 
of the stability of the arithmetic mean and median, one which 
does not involve the computation of the standard deviation of 
these averages. 


V.

The Relative Stability of the Median and Arithmetic
Mean, Determined from the Frequency Distributions
of These Averages.

1. The Frequency Distributions of the Median and Mean.

Since the true value of the standard deviation of medians of
samples containing $s$ items, drawn from a frequency distribution
which is composed of two normal distributions, is not easily
determinable, and since the customary approximation is not
sufficiently accurate to justify its use in the study of small samples
drawn from a distribution of this type, we shall develop a method
of comparing the relative stability of the median and arithmetic
mean, based not on the standard deviations of these two averages
but on their frequency distributions.

Another consideration, aside from expediency, motivates the
development of this method, for even if the standard deviations
of the arithmetic mean and median could be accurately computed,
they would not determine the relative stability of the two averages
unless it is assumed that the frequencies of the mean and median
are distributed in the same fashion. If, however, the equations
of the frequency curves of the mean and median of samples of
$s$ items drawn from a given distribution are determined, then
by comparing the deviations from the median of corresponding
percentiles of these two averages, a judgment as to the relative
stability of the two averages may be formed.

We shall assume the frequency curve of the arithmetic means
of samples of $(2n+1)$ items to be normal, independent of the
form of the theoretical distribution from which the samples are
drawn, and to possess a standard deviation of $\sigma/\sqrt{2n+1}$,
where $\sigma$ is the standard deviation of the theoretical distribution.*
We proceed to determine the equation of the frequency
curve of the medians of these samples.

Let the equation of the original distribution be $y = f(x)$,
and let the condition

$$\int_{-\infty}^{0} f(x)\,dx = \int_{0}^{\infty} f(x)\,dx = \tfrac{1}{2}$$

be satisfied. Then the probability that the median of a sample
of $(2n+1)$ items will fall in the interval $(x, x+dx)$ is the
product of the probabilities that an item will fall in this interval
and that of the remaining items, $n$ will fall above this interval
and $n$ below this interval. We let $F(x)$ denote the frequency
function according to which the medians of the samples are distributed,
and obtain†

(1) $F(x) = \dfrac{(2n+1)!}{(n!)^2}\, f(x) \left[\dfrac{1}{4} - \left(\displaystyle\int_0^x f(x)\,dx\right)^2\right]^n.$
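A quick numerical sanity check of the frequency function of the median can be sketched as follows. The sketch assumes the usual order-statistic form, $(2n+1)!/(n!)^2 \cdot f(x)\,[\tfrac14 - (\int_0^x f)^2]^n$, with the parent median at the origin; the parent is taken to be a unit normal purely for illustration, and the names are hypothetical.

```python
from math import factorial
from statistics import NormalDist

def median_density(x, n, pdf, cdf):
    """Density of the median of 2n+1 items when the parent median is at 0."""
    half_area = cdf(x) - 0.5                      # integral of f from 0 to x
    coeff = factorial(2 * n + 1) / factorial(n) ** 2
    return coeff * pdf(x) * (0.25 - half_area ** 2) ** n

nd = NormalDist()
n = 3                                             # samples of 7 items
step = 0.001
xs = [-8 + k * step for k in range(16001)]
# Riemann sum of the density over [-8, 8]; it should be very close to 1.
total = sum(median_density(x, n, nd.pdf, nd.cdf) for x in xs) * step
```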


2. The Stability of an Average Determined from Its Probable
Error.

Expressions for the frequency functions of the median and
mean of samples containing $(2n+1)$ items having been determined,
we may form a judgment as to the relative stability of
these two averages for a given distribution by comparing the
deviations, from the median of the given distribution, of
corresponding percentiles of the two averages. If some definite
criterion of relative stability is desired, it seems natural to select
the probable errors of the averages, where the term "probable
error" is understood to have its original meaning, and not to
denote a fixed multiple of the standard deviation of the average.
We shall therefore proceed to determine the deviation, from the
median, of a given percentile of the frequency distribution of
medians of samples containing $(2n+1)$ items drawn from the
distribution whose equation is $y = f(x)$.

* Rietz, H. L., Mathematical Statistics, Carus Monograph III, p. 127.

† An expression for the probability density of the median is
found in a paper by E. L. Dodd (Functions of Measurements under General
Laws of Error, Skandinavisk Aktuarietidskrift, 1922, p. 150), and is there
used in comparing the relative stability of the median and arithmetic mean
of certain theoretical frequency distributions. However, the method used
in Dodd's paper is to compare the probability densities of the two averages
at the median of the original distribution, rather than to compare the
deviations from the median of specific percentiles of the frequency curves
of the two averages, as we shall do. Dodd uses Stirling's formula to obtain
an approximation to the probability density at the median,



and represents the probability density of the arithmetic mean at the same
point by the expression


where $\sigma$ is the standard deviation of the original distribution.

It is readily seen that this method of comparison, when applied to
small samples, would lead to exactly the same inaccuracies that would
result if the relative stability of the two averages were determined by
comparing their standard deviations, the customary approximation formula
being used to obtain the value of the standard deviation of the median, since

izOnH) ^ Lk. . 

y T * ' (fy/Sr \J37H-I < 5 ^, 




fz(Z77H} 

T 


po} -r 





Let $S$ denote that fraction of the area under the frequency
curve of the medians which is bounded by ordinates drawn to
the curve at the points $0$ and $x$. Our problem is to
determine the value of $x$ which corresponds to an assigned value
of $S$, and from (1) the relationship between $x$ and $S$ is seen
to be expressible in the form

(2) $S = \dfrac{(2n+1)!}{(n!)^2} \displaystyle\int_0^x f(x) \left[\dfrac{1}{4} - \left(\int_0^x f(x)\,dx\right)^2\right]^n dx,$

where $S$ may be assigned any value in the interval $(0 < S < \tfrac{1}{2})$.

If the transformation $t = 2\int_0^x f(x)\,dx$
be applied to equation (2) it becomes

(3) $S = \dfrac{(2n+1)!}{2^{2n+1}(n!)^2} \displaystyle\int_0^{\alpha} (1 - t^2)^n\,dt,$ where $\alpha$ corresponds to $x$.

Then

(4) $\dfrac{2^{2n+1}(n!)^2}{(2n+1)!}\,S = \alpha - \dfrac{n}{3}\alpha^3 + \dfrac{n(n-1)}{2!}\dfrac{\alpha^5}{5} - \dfrac{n(n-1)(n-2)}{3!}\dfrac{\alpha^7}{7} + \cdots,$

and using Stirling's approximation, we obtain

(5) $\dfrac{2S\sqrt{\pi n}}{2n+1} = \displaystyle\int_0^{\alpha} (1 - t^2)^n\,dt = \alpha - \dfrac{n}{3}\alpha^3 + \dfrac{n(n-1)}{2!}\dfrac{\alpha^5}{5} - \dfrac{n(n-1)(n-2)}{3!}\dfrac{\alpha^7}{7} + \cdots.$


It is observed from equation (3) that, for a fixed value of
$n$, $\alpha$ is a monotone increasing function of $S$, and that
$\alpha = 0$ when $S = 0$, and $\alpha = 1$ when $S = \tfrac{1}{2}$. It is also observed that,
for a fixed value of $S$, $\alpha$ is a monotone decreasing function of
$n$, and that $\lim_{n \to \infty} \alpha = 0$.

Unless $S$ is assigned a value near 0.5, we may obtain an
approximation to the value of $\alpha$ from equation (5) by neglecting
terms containing powers of $\alpha$ higher than the third. We wish
to determine the degree of approximation which is introduced
by dropping terms after the second from the second member of
equation (5). To this end, we shall first ascertain an interval
of values of $S$ for which it is true that $\alpha$ is not greater than
the simple, decreasing function of $n$, $1/\sqrt{n}$; that is, we shall
determine the interval of values of $S$ which satisfy the inequality

$$\frac{2S\sqrt{\pi n}}{2n+1} \le \frac{1}{\sqrt{n}} - \frac{n}{3}\,\frac{1}{n^{3/2}} + \frac{n(n-1)}{2!}\,\frac{1}{5\,n^{5/2}} - \frac{n(n-1)(n-2)}{3!}\,\frac{1}{7\,n^{7/2}} + \cdots$$

or

$$S \le \frac{2n+1}{2n\sqrt{\pi}} \left[1 - \frac{1}{3} + \frac{1}{2!}\,\frac{n-1}{5n} - \frac{1}{3!}\,\frac{(n-1)(n-2)}{7n^2} + \cdots\right].$$

The second member of this inequality is greater than

$$\frac{1}{\sqrt{\pi}} \left(1 - \frac{1}{3} + \frac{1}{2!}\,\frac{n-1}{5n} - \frac{1}{3!}\,\frac{(n-1)(n-2)}{7n^2} + \cdots\right),$$

and since the terms of the finite alternating series within parentheses
obviously decrease in numerical value, their sum will exceed
2/3 for all positive values of $n$. Therefore the inequality
will certainly be satisfied for all values of $S$ in the interval

$$|S| \le \frac{2}{3\sqrt{\pi}} = 0.3761,$$

and therefore $\alpha$ is not greater than $1/\sqrt{n}$ when $S$ is assigned
a value corresponding to a percentile of the frequency distribution
of the medians between the 13th and 87th percentiles. Certainly
in determining the first and third quartiles of the frequency
distribution of the medians, $\alpha$ will be less than $1/\sqrt{n}$.

Since in equation (5) the value of $S$ is given by a finite
alternating series whose terms do not increase in numerical value,
the error involved in neglecting terms after the second will be
less than the first term neglected:

$$\frac{2n+1}{2\sqrt{\pi n}}\,\frac{n(n-1)}{2!}\,\frac{\alpha^5}{5}.$$

But, when $|S| \le 0.3761$, it is true that

$$\frac{2n+1}{2\sqrt{\pi n}}\,\frac{n(n-1)}{2!}\,\frac{\alpha^5}{5} \le \frac{2n+1}{2\sqrt{\pi n}}\,\frac{n(n-1)}{2!\,5}\,\frac{1}{n^{5/2}} = \frac{2n^2 - n - 1}{20\sqrt{\pi}\,n^2} < \frac{1}{10\sqrt{\pi}}.$$

We conclude, therefore, that the value of $\alpha$ obtained from equation
(5) by neglecting powers of $\alpha$ higher than the third corresponds
to a percentile of the frequency distribution of the
median which differs from the assigned value of $S$ by not more
than 0.05.

We see, then, that an approximation to the value, $x$, of a
given percentile of the frequency distribution of the medians of
samples containing $(2n+1)$ items drawn from the theoretical
distribution whose equation is $y = f(x)$ may be obtained by solving
for $\alpha$ the third degree equation

(6) $\dfrac{n}{3}\alpha^3 - \alpha + \dfrac{2S\sqrt{\pi n}}{2n+1} = 0,$

where

$$\alpha = 2\int_0^x f(x)\,dx.$$

The tables of values of $p_i$ mentioned in the preceding section
afford a check on the accuracy of the results of equation
(6) when $(2n+1)$ is assigned the values 7 and 51. From these
tables it is observed that the third quartile of the frequency
distribution of the medians of samples containing 7 items falls at
the 62nd percentile of the theoretical distribution from which
the samples are drawn, and that for samples containing 51 items
the third quartile of the medians falls near the 55th percentile
of the original distribution. Hence, when $S = 0.25$ the value of
$\tfrac{1}{2}\alpha$, accurate to two places of decimals, is 0.12 when
$(2n+1)$ equals 7, and 0.05 when $(2n+1)$ equals 51.
In equation (6) let

$$K = \frac{2S\sqrt{\pi n}}{2n+1},$$

whence we obtain

$$\frac{n}{3}\alpha^3 - \alpha + K = 0.$$

Letting $\alpha = K + \Delta$, this equation becomes

$$\frac{n}{3}\left(K^3 + 3K^2\Delta + 3K\Delta^2 + \Delta^3\right) - \Delta = 0,$$

or as an approximation

$$\Delta = \frac{nK^3}{3\,(1 - nK^2)}.$$

Assigning to $(2n+1)$ the values 7 and 51 we obtain the following
results:

2n+1      K        Δ        α       ½α     Computed Value
  7     .2193    .0123    .2316    .1158        .12
 51     .0869    .0072    .0941    .0471        .05
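The first row of these results can be retraced with a few lines of arithmetic. The sketch assumes that $K = 2S\sqrt{\pi n}/(2n+1)$ and that the first-order correction is $\Delta = nK^3 / (3(1 - nK^2))$, obtained by dropping the terms in $\Delta^2$ and $\Delta^3$ after substituting $\alpha = K + \Delta$ in $\tfrac{n}{3}\alpha^3 - \alpha + K = 0$; the function names are hypothetical.

```python
from math import pi, sqrt

def stirling_K(n, S):
    """K = 2 S sqrt(pi n) / (2n + 1), the Stirling form of the constant term."""
    return 2.0 * S * sqrt(pi * n) / (2 * n + 1)

def alpha_first_correction(n, S):
    """Approximate root alpha = K + Delta of (n/3) a^3 - a + K = 0."""
    K = stirling_K(n, S)
    delta = n * K ** 3 / (3.0 * (1.0 - n * K ** 2))   # terms in Delta^2, Delta^3 dropped
    return K, delta, K + delta

# Third quartile (S = 0.25) of medians of samples of 7 items (n = 3).
K, delta, alpha = alpha_first_correction(3, 0.25)
residual = (3 / 3.0) * alpha ** 3 - alpha + K          # how well the cubic is satisfied
```

The residual is small but not zero, since the correction is only first order.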


Thus we have developed a method of determining the probable
error (or any percentile) of the median, which possesses the
double advantage of being applicable to distributions which do
not contain a very large number of items, and of being applied
easily to any distribution, for after (6) has been used to determine
the value of $\alpha$, the corresponding value of $x$ may be obtained
either from a table of integrals of a theoretical frequency function,
or from an actual distribution by cumulating frequencies
beyond the median.

The calculation of the probable error (or any percentile)
of the arithmetic mean offers no difficulty if we assume the means
of samples to be normally distributed. The relative stability of
the two averages may then be determined by comparing the
probable errors (or corresponding percentiles) of the two averages.
This method of comparison will be applied to a particular
distribution in section VI of this paper.


A table* of values of $K$ and $\alpha$ for certain assigned values
of $n$ and $S$ is given below.

Values of $K$ and $\alpha$ when $K = \dfrac{2S\sqrt{\pi n}}{2n+1} = \displaystyle\int_0^{\alpha} (1 - t^2)^n\,dt$

   n       S        K           α

  10      .05     .02669      .027
          .10     ...         .054
          .15     ...         .082
          .20     ...         .111
          .25     .13345      .143
          .30     .16014      .177
          .35     .18683      .217
          .40     .21352      .265
          .45     .24021      .334

  25      .05     .017377     .017
          .10     .034754     .035
          .15     .052131     .053
          .20     .069508     .073
          .25     ...         ...
          .30     .104262     .116
          .35     .121639     .143
          .40     .139016     .176
          .45     .156393     .223

  50      .05     .012409     .012
          .10     .024818     .025
          .15     .037227     .038
          .20     .049636     .052
          .25     .062045     .067
          .30     .074454     .083
          .35     ...         .102
          .40     .099272     .126
          .45     .111681     .161

 100      .05     .008818     .0088
          .10     .017636     .018
          .15     .026455     .027
          .20     .035273     .037
          .25     ...         .047
          .30     .052909     .059
          .35     .061727     .073
          .40     .070546     .090
          .45     .079364     .115

* Computed by Miss Beatrice Berberich, university computer, University
of Wisconsin.
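Entries of this kind can be retraced by solving $\int_0^{\alpha}(1-t^2)^n\,dt = K$ for $\alpha$, with $K = 2S\sqrt{\pi n}/(2n+1)$. The sketch below is one possible way of doing so and is not the computer's original procedure: it assumes Simpson's rule for the integral and plain bisection for the root, and all names are hypothetical.

```python
from math import pi, sqrt

def integral_one_minus_t2_pow(alpha, n, panels=2000):
    """Simpson's rule for the integral of (1 - t^2)^n from 0 to alpha."""
    h = alpha / panels
    total = 1.0 + (1.0 - alpha * alpha) ** n
    for j in range(1, panels):
        t = j * h
        total += (4 if j % 2 else 2) * (1.0 - t * t) ** n
    return total * h / 3.0

def alpha_for(n, S):
    """Return (K, alpha) with K = 2 S sqrt(pi n)/(2n+1) and the integral equal to K."""
    K = 2.0 * S * sqrt(pi * n) / (2 * n + 1)
    lo, hi = 0.0, 1.0                      # the integral increases with alpha on [0, 1]
    for _ in range(50):
        mid = (lo + hi) / 2.0
        if integral_one_minus_t2_pow(mid, n) < K:
            lo = mid
        else:
            hi = mid
    return K, (lo + hi) / 2.0

K10, a10 = alpha_for(10, 0.05)
K10b, a10b = alpha_for(10, 0.45)
K100, a100 = alpha_for(100, 0.45)
```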














VI.

An Application to a Particular Sequence of Economic
Data of Various Methods of Comparing the Stability
of the Arithmetic Mean and Median.

1. Dissection into a Symmetrical Distribution Composed of Two
Normal Distributions.

In a paper by W. L. Crum* a particular sequence of economic
data has been examined with the purpose of determining the
relative stability of its median and arithmetic mean. The series
studied comprises the monthly link relatives of the rate of interest
on 60-90 day commercial paper from January, 1890, to January,
1917. A frequency distribution of deviations from their medians
of the link relatives for each month is reproduced below, together
with the values of the first six moments of the distribution.


Frequencies of Deviations from the Medians

 Dev.  Freq.     Dev.  Freq.     Dev.  Freq.     Dev.  Freq.     Dev.  Freq.

 -37     1       -18     2        -7     6         4    19        15     1
 -32     1       -17     2        -6    23         5    13        16     1
 -30     1       -16     2        -5    10         6    13        17     1
 -29     1       -15     1        -4    13         7     8        18     2
 -28     1       -14     3        -3    19         8     6        23     1
 -24     1       -13     6        -2     9         9     5        24     1
 -23     1       -12     3        -1    11        10     2        28     1
 -22     1       -11     6         0    28        11     4        34     1
 -21     2       -10     3         1    22        12     3        35     2
 -20     1        -9     5         2    22        13     1        41     2
 -19     2        -8    11         3    13        14     2        42     1
                                                                  45     1

N = 324


Moments

              About 0                          With Sheppard
 Moment      Deviation       About Mean        Adjustments

   1            -0.46             0.00              0.00
   2           107.06           106.85            106.77
   3           656.45           793.68            793.68
   4            83520            84860             84465
   5          3015000          3209000          32l«l)00
   6        110890000        119480000         119370000
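The first two tabulated moments can be retraced from the frequency table. In the sketch below the frequencies for deviations -5 and 9, which are damaged in the scan, are taken as 10 and 5; these are the values implied by N = 324, by the position of the median in the zero class, and by the first moment -0.46, so they are inferences rather than readings. The variable names are hypothetical.

```python
# Deviation from the median -> frequency (values for -5 and 9 are inferred).
freq = {
    -37: 1, -32: 1, -30: 1, -29: 1, -28: 1, -24: 1, -23: 1, -22: 1, -21: 2,
    -20: 1, -19: 2, -18: 2, -17: 2, -16: 2, -15: 1, -14: 3, -13: 6, -12: 3,
    -11: 6, -10: 3, -9: 5, -8: 11, -7: 6, -6: 23, -5: 10, -4: 13, -3: 19,
    -2: 9, -1: 11, 0: 28, 1: 22, 2: 22, 3: 13, 4: 19, 5: 13, 6: 13, 7: 8,
    8: 6, 9: 5, 10: 2, 11: 4, 12: 3, 13: 1, 14: 2, 15: 1, 16: 1, 17: 1,
    18: 2, 23: 1, 24: 1, 28: 1, 34: 1, 35: 2, 41: 2, 42: 1, 45: 1,
}

N = sum(freq.values())
mean = sum(d * f for d, f in freq.items()) / N             # first moment about 0
second_about_zero = sum(d * d * f for d, f in freq.items()) / N
second_about_mean = second_about_zero - mean ** 2           # variance before Sheppard's adjustment
```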


* Loc. cit.

















Professor Crum's method of attack is to dissect the series,
according to Pearson's method, into two normal components
whose means are coincident. He therefore fits to the data a curve
whose equation is


( 1 ) 


3Z^ f Cf 


X 




and obtains for the parameters the values: 


A- cr. //:? * 

This theoretical distribution is of the type discussed in paragraph 
3, section IL Its median and mean will be equally stable if 
jO r satisfies the equation 

S-> i-2. /^-h c,c.^^ a . 

Letting C^=AZ5and this equation reduces to 

which has a root between 2.5 and 2,6. Since for the distribution 
under consideration - V. ? , the standard deviation of the arith- 
metic mean is larger than the standard deviation of the median, 
and the median is the more stable average. 


2. Dissection into an Asymmetrical Distribution Composed of 
Two Normal Distributions. 

In the method of dissection employed by Professor Crum, 
the slight positive skewness which the distribution possesses is 
ignored. We shall dissect the data into two normal components 
whose means are not equal, and investigate the relative stability 
of the median and mean of the resulting asymmetrical distribution : 


( 2 ) 




I 


Pearson's method of dissecting an asymmetrical distribution
depends on the solution of his "fundamental nonic,"







f'h. 

in which denotes the i moment of the given distribution, 

and ^ 3 ^^ . 

A value of /ia, having been obtained from this equation, the 
parameters of equation ( 2 ) are determined by solving, succes- 
sively, the equations 

equation are 

denoted 

^< /(-«',- A), 

^ ’ y^^-i " f ^ ^ A . 

The calculation of the Sturm's functions of the fundamental 
nonic shows it to have three real roots, two between 0 and — 100 , 
and a third between 200 and 300. The values of these roots are 
found to be . ■ 

^^^-•55517, ZiO. 

However, the use of the second and third of these roots leads to 
imaginary values of certain of the parameters of equation '( 2 )» 
and they are therefore rejected. Using the root, 7 ^= -5,^517' , 
the parameters of equation ( 2 ) are found to have the following 
values : 


41^ = o.<)&37^ 


(r= 9.0Z 

4;^ » 0.0 36,3^ 

iz.20^ 

c^; = 25.07 





3. Dissection into a Symmetrical Distribution Composed of Three 
Normal Distributions, 

Finally, let the given data be fitted by a frequency curve 
whose equation is 


( 3 ) 


32.4 f-C, 'ior*- *<5 






+ e 


Cx+fi-J 


where the origin is selected at the arithmetic mean of the original
series. The dissection depends on the solution of equation (7),
section III, which for the distribution under consideration has
the form


tiz to g 

U, -5^,35$ LL 


2‘I3.^ZZIL tt '’3.l6iuC'^ a.5Z7 - 


The only positive root of this equation is “ 6^. id. 

Solving, successively, equations (6) and (5) of section III, the 
values of the parameters of equation (3) are found to be 

X ^ <r- ^ (T^ 

I J } ^ 3i - rfiA J 

The accompanying figure shows the original distribution
(grouped into class intervals of five units) and the curves obtained
by each of the three methods of dissection, plotted on the same
set of axes. It appears from the figure that a distribution of
type (3) fits the data more closely than either of the other curves.
This fact may be checked by comparing the sums of the squares
of the differences between the actual and theoretical frequency
of each class. The values of these sums of squares of deviations
from theoretical distributions (1), (2), (3) are found to be,
respectively, 68.250, 51.604, 19.435.

4. Relative Stability of Median and Mean for Each of the Methods
of Dissection.

We turn now to the problem of comparing the stability of
the median and mean of the three theoretical frequency functions
obtained by dissection. Since $N = 324$ is fairly large, and since
the median of each of the theoretical distributions is located at
a point of relatively large frequency, we shall use the approximation
to the standard deviation of the median,

where $y_0$ is the ordinate at the median.

For Crum's dissection into two normal distributions, the
arithmetic mean and median are both at the origin. Hence



which verifies his conclusion that the median is more stable than 
the arithmetic mean. 

For the asymmetrical dissection into two normal curves,






- 0.3 J . 


A/ is a root of the equation 

rt 

r\- 


and its value, obtained ' by interpolation ini' a table of values of 

the integral Je~^t , is Hence 


vsr- \<r,^ ^ / 

For this method of dissection the arithmetic mean is more stable
than the median. This was to be expected, since the second component
contains so small a fraction of the total area that the
compound curve differs little from a single normal curve. The
figure shows that this curve would not naturally be chosen to
represent the given data.

For the symmetrical three normal curve dissection,


= 0 , 5 ^. 


= O. 5 5. 


A- ' • <57 ^ A C £ 

This method of dissection bears out Crum's conclusion that the
median of the original series is a more stable average than the
arithmetic mean, although the difference between the standard
deviations of the two averages obtained by this method is
considerably smaller than that obtained by the first method of
dissection.


5. The Probable Errors of the Median and Mean, Determined
from the Frequency Distributions of these Averages.

The above discussion has been concerned with certain the- 
oretical frequency curves, rather than with the actual data which 
these curves are intended to fit. We shall now compare the 
relative stability of the mean and median by the method of section 
V, which does not involve a fitting to the data of a theoretical 
frequency curve. 

The method of determining the quartiles of the frequency 
distribution of the median, developed in section V, assumed the 
number of items in the sample to be odd. We therefore solve 
the equations 

2 n-f-i , 

using the values 323 and 325 for $(2n+1)$, and obtain roots
K~ 0..0342 ^ ^ = a.aojLJ 



in each case, whence

$$\int_0^x f(x)\,dx = 0.0188.$$

For the given distribution, the median falls at zero deviation.
The fourteen items in the upper half of the zero class comprise
4.32% of the entire frequency distribution. Hence the third
quartile of the distribution of the medians has a value

$$x = \frac{0.0188}{0.0432} \cdot \frac{1}{2} = 0.2176.$$

Similar reasoning shows the value of the first quartile of the
medians to be -0.2176, whence the semi-interquartile range is
0.2176.

X. 

Since (T = the probable error of the arithmetic, 

mean has a value 

Thus the median is again shown to be more stable than the 
arithmetic mean. 
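The comparison can be retraced numerically under a few stated assumptions: the cubic $\tfrac{n}{3}\alpha^3 - \alpha + K = 0$ with $K = 2S\sqrt{\pi n}/(2n+1)$ and $S = 0.25$ is solved by bisection; $\tfrac12\alpha$ is read as the area between the median and the third quartile of the medians; the 14 items in the upper half of the zero class are spread evenly over half a unit; and the probable error of the mean is taken as the normal upper quartile times $\sigma/\sqrt{N}$ with $\sigma^2 = 106.77$ from the moment table. The function names are hypothetical.

```python
from math import pi, sqrt
from statistics import NormalDist

def cubic_alpha(n, S):
    """Smallest positive root of (n/3) a^3 - a + K = 0, K = 2 S sqrt(pi n)/(2n+1)."""
    K = 2.0 * S * sqrt(pi * n) / (2 * n + 1)
    lo, hi = 0.0, 1.0 / sqrt(n)          # a - (n/3) a^3 increases on this interval
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if mid - n * mid ** 3 / 3.0 < K:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def third_quartile_of_median(sample_size, share_upper_half_zero_class):
    n = (sample_size - 1) // 2
    area = cubic_alpha(n, 0.25) / 2.0    # area from the median to the quartile
    return 0.5 * area / share_upper_half_zero_class

share = 14 / 324                          # upper half of the zero class
pe_median_323 = third_quartile_of_median(323, share)
pe_median_325 = third_quartile_of_median(325, share)
pe_mean = NormalDist().inv_cdf(0.75) * sqrt(106.77) / sqrt(324)
```

Under these assumptions the probable error of the median comes out near 0.22 and that of the mean near 0.39, which agrees with the conclusion that the median is the more stable average for this series.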

Harry S. Pollard,

Miami University, 

Oxford, Ohio. 



AN APPLICATION OF CHARACTERISTIC FUNCTIONS
TO THE DISTRIBUTION PROBLEM
OF STATISTICS*

By

Solomon Kullback,

George Washington University, Washington, D. C.

CONTENTS 

Part I                                                        Section

Introduction                                                     I
Characteristic Functions                                         II
Theorems Regarding a Single Function                             III
Theorems Regarding Several Functions                             IV

Part II

Distribution of the Arithmetic Mean                              V
Distribution of the Geometric Mean                               VI
Lemma                                                            VII
Distribution of Variance of a Sample of $n$ From a
   Normal Population                                             VIII
Distribution of the $\chi^2$ of Goodness of Fit Test             IX
Simultaneous Distribution of Variances and Correlation
   Coefficient of a Sample of $n$ from a Bi-variate
   Normal Population                                             X
Distribution of the Covariance of a Sample of $n$ from
   a Bi-variate Normal Population                                XI

Do N Samples of categories, come from the Same 

rL- variate Normal Population? XII 

Distribution of the Generalized Variance of a Sample of 

Af from an ?t^- variate Normal Population XIII 

Part III 

Summary and Conclusions XIV 


* Presented to the American Mathematical Society, under the title, "An
Application of Characteristic Functions to Statistics," Feb. 25, 1933. This
paper was prepared under the guidance of Professor F. M. Weida.



264 CHARACTERISTIC FUNCTIONS AND DISTRIB UTI ON 


PART I

The General Theory

I. Introduction:* By the distribution problem of statistics
we mean the problem of determining the distribution law of
functions of variables satisfying known distribution laws. Many
particular problems of this nature have been solved by various
methods. In Part 1 of this paper we develop a general solution for
this problem for functions of variables satisfying continuous
distribution laws. The general result is then applied in Part 2 to
derive the distribution laws of several functions whose distribution
laws have been derived by other methods and of some functions
whose distribution laws have not been given or given only for
special cases; in Part 3 we summarize the results. The method of
solution is related to the concept of characteristic function.

The theory of characteristic functions is essentially a development
of Laplace's "fonction génératrice." In this paper we
shall adopt the term characteristic function, although the same
concept has been termed generating function and reciprocal function.
Poisson employed the methods of Laplace to discuss,
in particular, "Sur la Probabilité des Résultats Moyens des
Observations." Cauchy was apparently the next to study and apply
this theory; he applied the basic concept of characteristic function
in connection with what he called "coefficient limitateur ou
restricteur" to study the problem of a function of errors. In
particular he studied the case of a linear function of the errors.
More recently the same concept has been reintroduced under the
name of characteristic function by Poincaré and also by P.
Lévy,17, 18, 18 who employs it to consider the composition of laws
of probability, the notion of the limit of a probability law, the idea
of stable and semi-stable laws, etc.

In a series of papers, C. V. L. Charlier further applied and
developed the theory of characteristic functions (though he
employed the terminology of reciprocal functions) to develop the
Gram-Charlier Type A and Type B series, and to consider the
distribution law of functions of variables satisfying general
frequency laws. Under the name of "Erzeugende Funktion," T.
Kameda studied the properties of functions which are intimately
related to characteristic functions. In particular, he discussed the
development of a function as a series of Hermite Polynomials and
also considered the problem of finding the distribution law of a
function of variables obeying general distribution laws.

* The reference numbers correspond with the number of the item in
the bibliography.

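II. Characteristic Functions: By the characteristic function
of the distribution law of the variable $x$ is meant the mean* value
of $e^{itx}$ where $i = \sqrt{-1}$. Thus, for a continuous distribution
law, if $f(x)\,dx$ is the probability to within infinitesimals of a
higher order that $x < X < x + dx$ and $\varphi(t)$ is the
characteristic function of the d.l.** of $x$ then

(1) $\varphi(t) = \displaystyle\int e^{itx} f(x)\,dx$

where the limits of the integral depend upon the range of applicability
of $f(x)$. We may also write
$\varphi(t) = \int_{-\infty}^{\infty} e^{itx} f(x)\,dx$ if we
agree that $f(x) = 0$ outside the range of applicability. The
characteristic function derives its importance from the fact that

(2) $f(x) = \dfrac{1}{2\pi} \displaystyle\int_{-\infty}^{\infty} e^{-itx}\,\varphi(t)\,dt.$

For the case of several variables, we have that the characteristic
function of the d.l. $f(x_1, x_2, \ldots, x_n)$
is given by

(3) $\varphi(t_1, \ldots, t_n) = \displaystyle\int \cdots \int_R e^{i(t_1 x_1 + \cdots + t_n x_n)} f(x_1, \ldots, x_n)\,dx_1 \cdots dx_n$

where $R$ is the region of applicability of $f(x_1, \ldots, x_n)$. We
may also write

$$\varphi(t_1, \ldots, t_n) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{i(t_1 x_1 + \cdots + t_n x_n)} f(x_1, \ldots, x_n)\,dx_1 \cdots dx_n$$

provided we agree that $f(x_1, \ldots, x_n) = 0$ outside the
region $R$. As for the case of a single variable we have here too

(4) $f(x_1, \ldots, x_n) = \dfrac{1}{(2\pi)^n} \displaystyle\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{-i(t_1 x_1 + \cdots + t_n x_n)}\,\varphi(t_1, \ldots, t_n)\,dt_1 \cdots dt_n.$

* Also known as probable or expected value.

** We shall designate distribution law hereafter by d.l.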
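The pair of relations between a distribution law and its characteristic function can be illustrated numerically for a single variable. The sketch below assumes a unit normal law and a plain trapezoidal rule on truncated ranges; the helper names are hypothetical and the truncation limits are chosen only so that the neglected tails are negligible.

```python
import cmath
from math import exp, pi, sqrt

def trapz(func, a, b, steps):
    """Composite trapezoidal rule for a real or complex integrand."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for k in range(1, steps):
        total += func(a + k * h)
    return total * h

def density(x):
    """Unit normal distribution law, used only as an example."""
    return exp(-x * x / 2.0) / sqrt(2.0 * pi)

def char_fn(t):
    """phi(t) = integral of e^{itx} f(x) dx, truncated to [-10, 10]."""
    return trapz(lambda x: cmath.exp(1j * t * x) * density(x), -10.0, 10.0, 2000)

def invert(x):
    """f(x) = (1/2pi) integral of e^{-itx} phi(t) dt, truncated to [-8, 8]."""
    value = trapz(lambda t: cmath.exp(-1j * t * x) * char_fn(t), -8.0, 8.0, 320)
    return (value / (2.0 * pi)).real

phi_at_1 = char_fn(1.0)      # known closed form for the unit normal: e^{-1/2}
recovered = invert(0.5)      # should reproduce density(0.5)
```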






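We shall prove that the following extensions are also possible.
Consider the function $u(x_1, x_2, \ldots, x_n)$* of the variables
$x_1, x_2, \ldots, x_n$; the characteristic function of the d.l. of $u$ is given by

(5) $\varphi(t) = \displaystyle\int \cdots \int_R e^{itu(x_1, \ldots, x_n)} f(x_1, \ldots, x_n)\,dx_1\,dx_2 \cdots dx_n$

where $R$ is the region of applicability of $f(x_1, \ldots, x_n)$.
The d.l. of $u$, $P(u)$, is given by

(6) $P(u) = \dfrac{1}{2\pi} \displaystyle\int_{-\infty}^{\infty} e^{-itu}\,\varphi(t)\,dt$

where $\varphi(t)$ is defined by (5).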

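If we consider the several functions $u_1(x_1, \ldots, x_n)$,
$u_2(x_1, \ldots, x_n)$, $\ldots$, $u_r(x_1, \ldots, x_n)$ of the
variables $x_1, x_2, \ldots, x_n$ whose d.l. is $f(x_1, \ldots, x_n)$,
then the characteristic function of the d.l. of $u_1, \ldots, u_r$
is given by

(7) $\varphi(t_1, \ldots, t_r) = \displaystyle\int \cdots \int_R e^{i[t_1 u_1(x_1, \ldots, x_n) + \cdots + t_r u_r(x_1, \ldots, x_n)]} f(x_1, \ldots, x_n)\,dx_1\,dx_2 \cdots dx_n$

where $R$ is the region of applicability of $f(x_1, \ldots, x_n)$. The
d.l. of $u_1, \ldots, u_r$ is given by

(8) $P(u_1, \ldots, u_r) = \dfrac{1}{(2\pi)^r} \displaystyle\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{-i(t_1 u_1 + \cdots + t_r u_r)}\,\varphi(t_1, \ldots, t_r)\,dt_1 \cdots dt_r$

where $\varphi(t_1, \ldots, t_r)$ is defined by (7).

* The conditions which $u$ must satisfy will be developed
further in this paper.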

III. Theorems Regarding a Single Function $u(x_1, \ldots, x_n)$.
We shall now justify our statements and determine the precise
conditions the function $u$ must obey.

Consider the function $u(x_1, \ldots, x_n)$ of the variables
$x_1, \ldots, x_n$ satisfying the continuous d.l. $f(x_1, \ldots, x_n)$
such that

$$\int \cdots \int_R f(x_1, \ldots, x_n)\,dx_1 \cdots dx_n = 1.$$

The function $u$ may have at most a denumerable infinity of
discontinuities. The probability that $u(x_1, \ldots, x_n)$ satisfies the
conditions

(9) $u_1 < u < u_2$

is given by

(10) $\displaystyle\int \cdots \int_A f(x_1, \ldots, x_n)\,dx_1 \cdots dx_n$

where $A$ is the region defined by the inequalities $u_1 < u < u_2$. To avoid
the difficulty of integrating over the region $A$ we shall avail
ourselves of the discontinuity factor (See Whittaker and Watson,
§ 9.7).


( 11 ) 




-it 

J B cLt ^ 


where /" - / 


/oa F tL^ i LCj F^O U. ~ 

We are now able to say that the required probability is given* 
by 


( 12 ) 






<£ X 


If we set $2\omega = u_1 + u_2$ and $2\gamma = u_2 - u_1$, the required
probability may also be written as




oo -X 
1 - 


Integrating with respect to (9 , we obtain 

it {a-oj) 

I dx-d^^- I € 

H 


OO 

( 14 ) -^Ff T(^o-xJdx,-.d^„- 


Zs/1% ~ 


dt. 


( 15 ) 


We now want to prove that 


f^d-X /e 

jy J 


^ ^ dt 

t 


oo 

/ \ , Oft f l\ 


zdX. 


where we write 2 - x^y • x^) ; dX = dx,, dx^^- - dx^ 


* This method is essentially an application of Cauchy's "coefficient
limitateur ou restricteur." See C. R., Vol. 37, p. 150 ff., and Whittaker and
Robinson, Calculus of Observations, p. 169.





and as the imultipie integral over the region , 
We have that 


( 16 ) 


-osst ^ J f: 

/ oo ^ 




Or t C±(U--Us) 

J—c dt 

itCil-ta} 


d-t , 


(17) 


We will now prove that 

J^cix 


"71 f 2 ^ JJ. f , 

J — -i' dX 


For this, it is sufficient^- to prove the existence of the (nn) 
fold integral** e dj di , 

and the existence of the right-hand member of (17). 

Consider** the rectangular region Gr in (rtr)) fold space 
defined by 0 ^ t ^ t j X; « X- ~ X,* ■ J - A Z 3, 

- ' ^ if ^ 7 & ) 

where we shall designate the region ^ Xy 
by E . Then, over G the multiple integral of 

^ s,Vi ^ 

? e 

exists since the integrand is bounded and has at most a denumer- 
able infinity of singularities (those of * . * - Then^*^ 

Zs^'^ tt <:«-*«) fzs/'n^ f 

J i —Z — e 


(18) 


Now for any positive € there exists a )> o such that 


(19) 


U ^7 ^ /* 

^ 7 £- oLX < 


* For the sake of convenience we shall understand a single integral
sign to represent a multiple integral where necessary.

** The proof here given is modeled after a similar one of E. L. Dodd
(See Annals of Math., 2nd S., Vol. 27, pp. 12-20).





I - I I A \ . 

e j =jj^^ I == / 


( 20 ) 



o . Furthermore, 

J tr 

0 



t, 

1 A, - eiit’ i-t («■-«') 


■ ■x't 

1 d± 

£ 

/ 

W ^ 


i ^ 


< V 


and since j I- dX = / , we can find a rectangular region B , 
such that if E encloses ^ arid E^ is that portion of £ not in £, , 


( 21 ) 


^ ^■<^x I <1- 


Thus 



d X 



LtCu-td) 

e d-t 



Hence, since and £ may now increase without limit (19) 
and (22) show the convergence of the (n+i) fold integral of 

aWw ^ i±(u-Lu) 
a. ^ e 


But since 


It 


exists for all values of t . .Therefore, 


It 




J f 




cL X 


exists being equal to the corresponding multiple integral whose 
existence has just been proved. We have thus established (17) 
by using the theorem that if the multiple integral and a corre- 
sponding iterated integral both exist they are equal. 





We can show in a similar manner that 




O 


duf- 


r 7 ■ r 

d-ir = j e ■ 2- ■ 


so that finally 


(23) 




„ ^ ct(u-cu) 

e cLt 




i 


^ f tt(a-<M) 

i-dt\ie dX. 


Let $u_1$ and $u_2$ approach an intermediate value $v$ as a limit
with $u_2 - u_1 \to dv$. Then $2\gamma \to dv$ and $\omega \to v$ and in the limit

(24)„. 


Of 

T^(v) dv = — /f 
zr J 


„ . ^Lt\f f 

^ e cC^ & ■ 2 ■ ^ 2; . 


7^ Cv) exists since 


and 


j _L fz sfk^ < — f 
|;iW'7 — 2-3:-e dt\-= rrj- 


e 


di =/. 


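Therefore, to within infinitesimals of a higher order, the d.l. of
$u$ is given by

$$\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-itv}\,dt \int e^{itu} f\,dX$$

or

(25)
$$P(v) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-itv}\,\varphi(t)\,dt \quad \text{where} \quad \varphi(t) = \int e^{itu(x_1, x_2, \dots, x_n)} f\,dX.$$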

An application of Fourier’s Integral Theorem³⁸ to (25) yields
finally

(26)
$$\varphi(t) = \int_{-\infty}^{\infty} e^{itv}\,P(v)\,dv = \int e^{itu} f\,dX$$

where $P(v) = 0$ outside the range of applicability.





From (26) we see that $\varphi(t)$ is the characteristic function
of the d.l. of $u$.

We now state 

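*Theorem I. If $u = u(x_1, x_2, \dots, x_n)$ is any function, which
may have at most a denumerable infinity of discontinuities, of the
variables $x_1, x_2, \dots, x_n$, where the distribution law of $x_1, x_2, \dots, x_n$
is given by $f(x_1, x_2, \dots, x_n)$, which is on a certain $n$-dimensional
manifold $R$ a single valued, non-negative continuous function
such that

$$\int_R f(x_1, x_2, \dots, x_n)\,dx_1\,dx_2 \cdots dx_n = 1,$$

then the characteristic function of the distribution law of $u$ is given by

$$\varphi(t) = \int_R e^{itu} f(x_1, x_2, \dots, x_n)\,dx_1\,dx_2 \cdots dx_n.$$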

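**Theorem II. Under the conditions of Theorem I, the distribution
law of $u$ is given by

$$P(u) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-itu}\,\varphi(t)\,dt \quad \text{where} \quad \varphi(t) = \int_R e^{itu} f(x_1, x_2, \dots, x_n)\,dx_1\,dx_2 \cdots dx_n.$$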
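As a numerical illustration of Theorems I and II, the following Python sketch (not part of the original paper) takes $u = x_1 + x_2$ with $x_1, x_2$ independent standard normal variables. This example is an assumption chosen because its characteristic function, $\varphi(t) = e^{-t^2}$, is known in closed form. The inversion integral of Theorem II is approximated by a trapezoidal rule on a truncated range; the function names, the truncation limit and the grid size are hypothetical choices.

```python
import numpy as np


def invert_cf(phi, u, t_max=40.0, n_points=40001):
    """Approximate P(u) = (1/2 pi) * integral of e^{-itu} phi(t) dt over [-t_max, t_max]."""
    t = np.linspace(-t_max, t_max, n_points)
    integrand = np.exp(-1j * t * u) * phi(t)
    dt = t[1] - t[0]
    # Trapezoidal rule written out explicitly.
    integral = dt * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    return integral.real / (2.0 * np.pi)


def phi_sum_of_two_normals(t):
    """Theorem I for u = x1 + x2: the product of two normal c.f.s e^{-t^2/2}."""
    return np.exp(-t ** 2)


def normal_density(u, variance):
    """Normal density with mean zero, used only as the reference value."""
    return np.exp(-u ** 2 / (2.0 * variance)) / np.sqrt(2.0 * np.pi * variance)


if __name__ == "__main__":
    for u in (0.0, 1.0, 2.5):
        print(u, invert_cf(phi_sum_of_two_normals, u), normal_density(u, 2.0))
```

The recovered values should agree with the normal law of variance 2, which is the known distribution of the sum in this assumed example.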


IV. Theorems Regarding Several Functions: The procedure in the case where we consider
several functions $u_j = u_j(x_1, x_2, \dots, x_n)$ of the variables
$x_1, x_2, \dots, x_n$ is similar to that above.

* Charlier³ (Arkiv, Vol. 8) considers a function $u$
which may not be infinite for real $x$, nor may the maxima and minima
of $u$ be infinitely dense for any values of the variables.

Kameda¹⁵ (Proc., Vol. 9) considers a function $u(x_1, x_2, \dots, x_n)$
such that (1) $u$ must be a continuous function of at least one argument,
say $x_n$; (2) the derivative of $u$ with respect to $x_n$ exists; (3) there
exists no interval of $x_n$ for which this derivative is identically zero; (4) the
function $u$ and its derivatives have the same sign in the neighborhood of $\pm\infty$.

** Dodd⁶ (Annals, Vol. 27) considers the distribution of a continuous
function $u(x_1, x_2, \dots, x_n)$.

The probability that the functions $u_j$, where the $u_j$ and $f(x_1, x_2, \dots, x_n)$ are defined
as for the case of a single function $u$, satisfy the conditions

$$u_j' < u_j < u_j''$$


(27) 


is given by 


/ 

I 


< ^2. < < 

< < < < 


(28) 




where the region $E$ is defined by the set of inequalities (27).
We can avoid the difficulty of integrating over the region $E$ by
introducing the discontinuity factor³⁸


(29) /^=r~ 
■ ^ (iTT) 


where F - t for 


and 




F = o for 



de--d%, 

t Jt If ^ 


We can now say that the probability that the $u_j$
satisfy the conditions (27) is given by


(30) _J_ f g dJL f f e. 

(^rr)^i J I 






.00 VL ^ 


In a manner entirely analogous to the case of a single function
$u$, we find that


(31) 


V' n 

(xw) 


'‘-cLk 


vt. 


. t t ) dt--dt 

t J Sj J ■^/ I 




where 

(32) 


e , d:x. 


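An application of Fourier’s Integral Theorem³⁸ to (31) yields

$$\varphi(t_1, t_2, \dots, t_s) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{i(t_1 u_1 + \cdots + t_s u_s)}\,P(u_1, u_2, \dots, u_s)\,du_1 \cdots du_s$$

where $P(u_1, u_2, \dots, u_s) = 0$ outside the region of applicability,
which shows that $\varphi(t_1, t_2, \dots, t_s)$ given as in (32) is
the characteristic function of the d.l. of $u_1, u_2, \dots, u_s$.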

We now state 

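Theorem III. If $u_j = u_j(x_1, x_2, \dots, x_n)$, $(j = 1, 2, \dots, s)$,
which may have a denumerable infinity of discontinuities, are functions
of the variables $x_1, x_2, \dots, x_n$ whose distribution law
is given by $f(x_1, x_2, \dots, x_n)$, which is on a certain $n$-dimensional
manifold $R$ a single valued, non-negative continuous function
such that $\int_R f(x_1, x_2, \dots, x_n)\,dx_1\,dx_2 \cdots dx_n = 1$, then
the characteristic function of the distribution law of $u_1, u_2, \dots, u_s$
is given by

$$\varphi(t_1, t_2, \dots, t_s) = \int_R e^{i(t_1 u_1 + t_2 u_2 + \cdots + t_s u_s)} f(x_1, x_2, \dots, x_n)\,dx_1\,dx_2 \cdots dx_n.$$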

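Theorem IV. Under the conditions of Theorem III, the distribution
law of $u_1, u_2, \dots, u_s$ is given by

$$P(u_1, u_2, \dots, u_s) = \frac{1}{(2\pi)^s}\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{-i(t_1 u_1 + \cdots + t_s u_s)}\,\varphi(t_1, t_2, \dots, t_s)\,dt_1 \cdots dt_s$$

where

$$\varphi(t_1, t_2, \dots, t_s) = \int_R e^{i(t_1 u_1 + \cdots + t_s u_s)} f(x_1, x_2, \dots, x_n)\,dx_1\,dx_2 \cdots dx_n.$$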







PART 2 

Various Special Cases of the Distribution Problem 

V. Distribution of the arithmetic mean: If we take
$u = x_1 + x_2 + \cdots + x_n$ and assume that $x_1, x_2, \dots, x_n$
are independently distributed each according to the same distribution
law $f(x)$, then we find for the distribution of totals

(33)
$$P(u) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-itu}\left[\int_{-\infty}^{\infty} e^{itx} f(x)\,dx\right]^{n} dt.$$

The substitution $u = n\bar{x}$ will then yield the distribution of the
arithmetic mean.

This result has been derived previously by Poisson,²⁸ F. Hausdorff¹¹
and J. O. Irwin.¹³

Hausdorff applied it in particular to find the distribution of
means of samples obeying the law $f(x) = 1$ for $0 \le x \le 1$
and 0 elsewhere (a rectangular universe); also to the
law $f(x) = \frac{1}{2} e^{-|x|}$, $-\infty \le x \le \infty$. Irwin has applied
it to the normal law, Pearson Type III distribution, Pearson Type
II distribution and a rectangular universe.
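The distribution of totals can be checked numerically for a rectangular universe. The sketch below (an illustration only, not taken from the paper) assumes the law $f(x) = 1$ on $(0, 1)$, whose characteristic function is $(e^{it} - 1)/(it)$, raises it to the $n$-th power, and evaluates the inversion integral by a trapezoidal rule. The truncation limit, grid size and function names are hypothetical choices.

```python
import numpy as np


def cf_rectangular(t):
    """C.f. of the assumed rectangular law f(x) = 1 on (0, 1)."""
    t = np.asarray(t, dtype=float)
    out = np.ones_like(t, dtype=complex)
    nonzero = t != 0.0
    out[nonzero] = (np.exp(1j * t[nonzero]) - 1.0) / (1j * t[nonzero])
    return out


def density_of_total(u, n, t_max=400.0, n_points=400001):
    """Distribution of u = x_1 + ... + x_n by numerical inversion of the c.f."""
    t = np.linspace(-t_max, t_max, n_points)
    integrand = np.exp(-1j * t * u) * cf_rectangular(t) ** n
    dt = t[1] - t[0]
    integral = dt * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    return integral.real / (2.0 * np.pi)


def density_of_mean(xbar, n, **kwargs):
    """Substituting u = n * xbar gives P_mean(xbar) = n * P_total(n * xbar)."""
    return n * density_of_total(n * xbar, n, **kwargs)


if __name__ == "__main__":
    # For n = 4 the density of the total at u = 2 is 2/3.
    print(density_of_total(2.0, 4), 2.0 / 3.0)
    print(density_of_mean(0.5, 4), 8.0 / 3.0)
```

The truncated integral is adequate here because the integrand decays like $t^{-n}$; for small $n$ a larger range or a different quadrature would be needed.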

VI. Distribution of the geometric mean:

Let $u = \log x_1 + \log x_2 + \cdots + \log x_n$ where $x_1, x_2, \dots, x_n$
are distributed independently each according to the same distribution
law $f(x)$, then

(34)
$$P(u) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-itu}\,dt \left\{\int_{0}^{\infty} x^{it} f(x)\,dx\right\}^{n}.$$

The distribution for the geometric mean $g$ is obtained from
that of $u$ by the transformation $u = n \log g$.

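a. Consider, for example, the case $f(x) = 1$ for $0 \le x \le 1$.

Then

(35)
$$P(u) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{e^{-itu}}{(1 + it)^{n}}\,dt,$$

where $-\infty < u \le 0$.

From (35) we have

(36)
$$P(u) = \frac{(-u)^{n-1} e^{u}}{\Gamma(n)},$$

since*

$$\frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{e^{itx}}{(c + it)^{n}}\,dt = \frac{x^{n-1} e^{-cx}}{\Gamma(n)}, \qquad x > 0, \; c > 0.$$

From (36) we obtain

(37)
$$P(g) = \frac{n^{n} g^{n-1} (-\log g)^{n-1}}{\Gamma(n)}, \qquad 0 \le g \le 1.$$

The result for $n = 3$ has been given by A. T. Craig.⁷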
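A quick consistency check of the rectangular case can be made numerically. The sketch below is an illustration under stated assumptions, not the author's computation: it assumes the variables are uniform on $(0, 1)$, so that the geometric mean $g$ has density $n^n g^{n-1}(-\log g)^{n-1}/\Gamma(n)$ on $(0, 1)$. It then verifies that this density integrates to 1 and that its mean equals $(n/(n+1))^n$, which is what independence gives directly from $E[x^{1/n}] = n/(n+1)$. The helper names and the midpoint rule are hypothetical choices.

```python
import math


def geometric_mean_density(g, n):
    """Density of the geometric mean of n independent uniform (0, 1) variables."""
    if g <= 0.0 or g > 1.0:
        return 0.0
    return n ** n * g ** (n - 1) * (-math.log(g)) ** (n - 1) / math.gamma(n)


def midpoint_integral(func, a, b, steps=200000):
    """Midpoint rule; it never evaluates func at the end points a and b."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))


if __name__ == "__main__":
    n = 3
    total = midpoint_integral(lambda g: geometric_mean_density(g, n), 0.0, 1.0)
    mean = midpoint_integral(lambda g: g * geometric_mean_density(g, n), 0.0, 1.0)
    print(total)                      # should be close to 1
    print(mean, (n / (n + 1)) ** n)   # both close to 27/64
```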

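b. Suppose now that $f(x) = \dfrac{x^{p-1} e^{-x}}{\Gamma(p)}$, $0 \le x \le \infty$.

Then

(38)
$$P(u) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-itu}\left[\frac{\Gamma(p + it)}{\Gamma(p)}\right]^{n} dt.$$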

Let then 


( 39 ) TCcm)- 


m 


rv 

ZTi 


ui y ^ crt 

& { /-ij dz . 


* MacRobert,²⁰ p. 67.

By a method similar to that used for the case of the generalized
variance (see Section XIII), we may show that




pt. a * 


2 e (R) 

so that the integral converges and


O 


(40) VCu.) -- - . / e (R) di, 

(RRrc 4 

where C is the contour bounded by the line .x- and that 

part of the circle ^ which lies to the right 

of the straight line. The contour is traversed in a counter-clock- 
wise direction. 


Now (R) = 
(41) TR) 


c-iR'r”' 


so that we may also write


/ 7L TV af: 

(-0 yr € . 

s,r,-rHRr ' 


The poles of the integrand are of the $n$th order and are
those of $\Gamma(-z)$, viz., $z = 0, 1, 2, \dots$. Since the
contour is traversed in a counter-clockwise manner, the value of
the integral is $2\pi i$ times the sum of the residues at the poles
within the contour so that

(42) Tcu.)^— V (j - t- - (i ] 


or 


■np-i ^ 

» ^o-'-iSsr'k” 


yti-yiJlH f' j Thi 




-1^] 


c. If instead of assuming the $x_j$ each satisfy the same
distribution law, we assume $x_j$ to be distributed according to
$f_j(x_j) = \dfrac{x_j^{p_j - 1} e^{-x_j}}{\Gamma(p_j)}$ where none of the $p_j$ are equal
or differ by an integer, then


( 44 ) 


or 


T 


zir 




Ji-b 


(45) 


TM 




il fi. • airl 

y-, V • 

The same results as to the convergence of the integral and 
the contour may be shown with respect to this integrand as for 
Section VI b. 

The value of 

€ JJl 

Jsf ' (f ' 

is times the sum of the residues within the contour bounded 
by the j/-axts and that part of the circle |2|= ^ 

which lies to the right of this line. 

For the pole the residue is 

^ 

therefore, 


(46) Tcu.) 


/ 


7^ OO A-H uCfitA) Tf 

c~i) e * 


ig 


1 g 

H=‘l ‘ 


where $\prod'$ means that in the product the index takes all the values
except $j$.

Finally,

r4,-4' 









d. Suppose that in the previous case $p_j = p + \dfrac{j-1}{n}$, $(j = 1, 2, \dots, n)$.


Since Of 

for this case 


-up 


TT-J. 


77 


Czw) “ /- 


nr 


( 48 ) Tcu-) = ^ 




Th 


CxTf) ^ hnpfmt 


ZT J 

■^oO 

Let Tif^t ni.'t = - Z ^ then 




Cz 7 r)'z l^-ip 


dt. 


- np-f’coo 

iMp np r ^ »' 

( 49 ) Not) = .2 / (c^ n) /-2 

niPp.an J 

Now it may be shown that 

•^OLtCOO 


_J_ 

JUTil 




- u. 


di = e. 


-O. 


where a>o and ^arnpu<^^ (See MacRobert/® p. ISl.) 


Tlierefore 

( 50 ) 


TZj^ie? 

e 7Z 


•776 


jyt 


7t 


fn^ 


n ^ 

Substituting ^ - c we, obtain for the distribution of ^ 


( 51 ) 




?5t7ta^ Vp-f - 7?5 
Yl. /c* (f 


S- 


P 




oO. 


In other words, the distribution of the geometric mean of $n$
independent variables respectively satisfying the distribution laws

$$\frac{x^{p-1} e^{-x}}{\Gamma(p)}, \quad \frac{x^{p + \frac{1}{n} - 1} e^{-x}}{\Gamma(p + \frac{1}{n})}, \quad \dots, \quad \frac{x^{p + \frac{n-1}{n} - 1} e^{-x}}{\Gamma(p + \frac{n-1}{n})}$$

is the same as the distribution of the arithmetic mean of $n$ independent
variables each satisfying the Pearson Type III distribution
law

$$f(x) = \frac{x^{p-1} e^{-x}}{\Gamma(p)}, \qquad 0 \le x \le \infty.$$
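The equivalence just stated can be checked through moments. The sketch below is an illustration only; it assumes that the parameters in case d are $p, p + 1/n, \dots, p + (n-1)/n$, which is the reading under which the statement follows from Gauss's multiplication theorem for the Gamma function. It compares $\log E[g^s]$ for the geometric mean $g$ with $\log E[m^s]$ for the arithmetic mean $m$ of $n$ Type III variables with parameter $p$. The function names are hypothetical.

```python
import math


def log_moment_geometric_mean(s, p, n):
    """log E[g^s], g = (x_1 ... x_n)^(1/n), x_j Type III with parameter p + (j-1)/n."""
    total = 0.0
    for j in range(n):
        p_j = p + j / n
        # E[x^(s/n)] for a Type III variable is Gamma(p_j + s/n) / Gamma(p_j).
        total += math.lgamma(p_j + s / n) - math.lgamma(p_j)
    return total


def log_moment_arithmetic_mean(s, p, n):
    """log E[m^s] for the mean m of n independent Type III variables with parameter p."""
    # The total is Type III with parameter n * p; the mean is the total divided by n.
    return math.lgamma(n * p + s) - math.lgamma(n * p) - s * math.log(n)


if __name__ == "__main__":
    for s, p, n in ((2.0, 1.5, 3), (0.7, 2.0, 5)):
        print(log_moment_geometric_mean(s, p, n), log_moment_arithmetic_mean(s, p, n))
```

Agreement of all moments of positive order is what one expects if the two distributions coincide; the check does not replace the contour-integral argument of the text.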

■ e. For the case where /“ 0 -^7 ' ' * ' ^ the 

discussion for the generalized variance (see Section XIII).

VII. Lemma: The following geometrical considerations will
for certain cases simplify the problem of finding the distribution
of statistical parameters calculated about a sample mean.

Consider the sample as a point or points (for multi-variate
distributions) in an $n$-dimensional Euclidean space. (This method
has been employed to great advantage by R. A. Fisher⁸ and
others.) Then, if the probability density at any point (the probability
for that particular combination of values to occur) is a
function of the distance from the origin, the mean value of a
function of the distance from the origin and of other geometric
invariants of the system for $x_1, x_2, \dots, x_n$ satisfying
the conditions $\sum x_j = 0$, $\sum y_j = 0, \dots$ will be the same as
for the same function for $n - 1$ independent variables in $(n-1)$-dimensional
space. Since the important element is the distance from the
origin and the integration is to be carried out over an $(n-1)$-dimensional
space, the final result is independent of the fact that
the whole system is immersed in an $n$-dimensional space.
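The content of the Lemma can be illustrated by simulation. The sketch below is a Monte Carlo illustration under assumptions, not a proof: it draws samples of $n$ standard normal values, centres them so that $\sum x_j = 0$, and compares the sum of squares with that of $n - 1$ unrestricted normal variables. The sample sizes, seed and function names are hypothetical choices.

```python
import numpy as np


def centred_sum_of_squares(n, samples, rng):
    """Sum of squares of n normal values measured about their sample mean."""
    x = rng.standard_normal((samples, n))
    x = x - x.mean(axis=1, keepdims=True)  # imposes sum x_j = 0 in each sample
    return (x ** 2).sum(axis=1)


def free_sum_of_squares(k, samples, rng):
    """Sum of squares of k independent normal values with no restriction."""
    x = rng.standard_normal((samples, k))
    return (x ** 2).sum(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(12345)
    n = 5
    restricted = centred_sum_of_squares(n, 200000, rng)
    unrestricted = free_sum_of_squares(n - 1, 200000, rng)
    # Both sets of values should show mean near n - 1 and variance near 2(n - 1).
    print(restricted.mean(), unrestricted.mean())
    print(restricted.var(), unrestricted.var())
```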

As an illustration, let us consider the following distributions 
which have been derived by various methods. 

VIII. Distribution of variance of a sample of $n$ from a
normal population:

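Let $u = x_1^2 + x_2^2 + \cdots + x_n^2$ where the $x_j$ are distributed according to
$\dfrac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{x^2}{2\sigma^2}}$.

Then

(52)
$$\varphi(t) = \frac{1}{(1 - 2\sigma^2 it)^{\frac{n}{2}}}.$$

(Compare Rider,³¹ Annals p. 600; Romanovsky,³⁴ Metron p. 6.)
Therefore, the distribution of

$$v = x_1^2 + x_2^2 + \cdots + x_n^2 = n s^2, \quad \text{where} \quad \sum x_j = 0,$$

is given by

(53)
$$P(v) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{e^{-itv}}{(1 - 2\sigma^2 it)^{\frac{n-1}{2}}}\,dt = \frac{v^{\frac{n-3}{2}}\,e^{-\frac{v}{2\sigma^2}}}{(2\sigma^2)^{\frac{n-1}{2}}\,\Gamma\!\left(\frac{n-1}{2}\right)}$$

(see MacRobert,²⁰ p. 67.)
We thus have

(54)
$$D(s^2)\,ds^2 = \frac{\left(\frac{n}{2\sigma^2}\right)^{\frac{n-1}{2}} (s^2)^{\frac{n-3}{2}}\,e^{-\frac{n s^2}{2\sigma^2}}}{\Gamma\!\left(\frac{n-1}{2}\right)}\,ds^2$$

as is well known.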




IX. Distribution of the $\chi^2$ of Goodness of Fit Test: Consider


2. ^ 

.2 

AK--f 




/-? = / 


where $R = |\rho_{jk}|$ and $R_{jk}$ is the cofactor of $\rho_{jk}$ in $R$, so that $x_1, x_2, \dots, x_n$
are distributed according to


21 






Therefore 

( 55 ) 


I ^ RifO-zit) 

-=> ^ - 'sR 2- 


.2r 


fut% 

e <tt 




d)c^dz£_'ci}(i^ 


iC 

7 




*• 00 






dx^ ■ Ax^ - dXf 







< 

^ ( 

tT J 


-Lt% 






<1% 


dL‘t 




T>/% 


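and we have finally,

(56)
$$P(\chi^2)\,d\chi^2 = \frac{(\chi^2)^{\frac{n-2}{2}}\,e^{-\frac{\chi^2}{2}}}{2^{\frac{n}{2}}\,\Gamma\!\left(\frac{n}{2}\right)}\,d\chi^2.$$

If we restrict the $x_j$ to satisfy $\sum x_j = 0$, then
from the preceding, it is clear that $\varphi(t) = \dfrac{1}{(1 - 2it)^{\frac{n-1}{2}}}$
and now

(57)
$$P(\chi^2) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{e^{-it\chi^2}}{(1 - 2it)^{\frac{n-1}{2}}}\,dt$$

or

(58)
$$P(\chi^2) = \frac{(\chi^2)^{\frac{n-3}{2}}\,e^{-\frac{\chi^2}{2}}}{2^{\frac{n-1}{2}}\,\Gamma\!\left(\frac{n-1}{2}\right)}.$$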






This latter case is the one commonly met with in actual practice
and is equivalent to the case wherein the expected values are adjusted
according to the total in the sample.
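The effect of the restriction on the exponent can be seen numerically. The sketch below is an illustration under assumptions: it takes the unrestricted law to be the $\chi^2$ density with index $n$ and the restricted law to be the same form with $n - 1$ in place of $n$, and checks by a midpoint rule that the restricted density integrates to 1 and has mean $n - 1$. The function names and integration limits are hypothetical choices.

```python
import math


def chi_square_density(x, index):
    """x^(index/2 - 1) e^(-x/2) / (2^(index/2) Gamma(index/2)) for x > 0, else 0."""
    if x <= 0.0:
        return 0.0
    half = index / 2.0
    log_density = (half - 1.0) * math.log(x) - x / 2.0
    log_density -= half * math.log(2.0) + math.lgamma(half)
    return math.exp(log_density)


def midpoint_integral(func, a, b, steps=200000):
    """Midpoint rule; it never evaluates func at the end points a and b."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))


if __name__ == "__main__":
    n = 6
    total = midpoint_integral(lambda x: chi_square_density(x, n - 1), 0.0, 100.0)
    mean = midpoint_integral(lambda x: x * chi_square_density(x, n - 1), 0.0, 100.0)
    print(total)        # close to 1
    print(mean, n - 1)  # the restriction lowers the mean from n to n - 1
```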

X. Simultaneous distribution of variances and correlation coefficient
of a sample of $n$ from a bi-variate normal population:
This is a special case of the problem of finding the simultaneous
distribution of the variances and covariances from an $n$-variate
normal population which has been solved by J. Wishart.⁴² The
same method is applicable to the general case, but for its own
interest and for the sake of simplicity this special case will be
considered.





- UL = — ; U =: J'-! 


Let it - ii! 


where $x_j$ and $y_j$ are distributed according to

- zO-z-V i ai" ~ 55 ^ 0: J 


ZTrr<r^ /7^ 

Now consider ^ ^ > x.-. 

00 ^ I o-£i,)^l 

‘^'■)L ' <r^ <r^ J 


(59) 


J- 



iO-z>'-) 


ZTT (f^ <J^ \[7^^ 


dL% 


[o-ct )o-ct^) -f'^0 


>Jz. 




'h 


“■jth, /^ (Tz/'t^jj.} 


I - ct^ 




I p C ^ 

Therefore, if we add the conditions $\sum x_j = 0$, $\sum y_j = 0$,

in which case


ij « ^ ^ « n.fh 5x ^ _ n 


(60) fO,t.%) = 

Therefore 


^ SzL 




and 


( 61 ) T^(u t\u) - — ~ / / 1 dL 

\ j ^ f X. I I J fy j %/• / Y Sr JL.Pl ^ 


2 } L {o-it,x>'i%) ^ 





Integrating with respect to $t_1$, we find

■oo .ct,u., 

( 62 ) [ 

JCD-ct. 


cLt, 


ZTT U.,<‘ e 


p Cn-i-tD 1 
/-ti, -I 


Integrating with respect to $t_2$, we find


aa 

( 63 ) Je 


.ii:±£.ct(u.yj±if) 

^ !-Ct * /-Z^/ 




' 3- I - ' 4J 


‘ o ij’q 


3 ^ 


Integrating with respect to $t_3$, we find

:) -e 


( 64 ) 






O-ct)^ fn^ 


Jl 

"3 


using the facts that 


430 


-ax. tz&x. 

dix. - 


K. e 

o- 




and 


oc, 

/ 


e. ax ZTT A 

ijr- e . 


Therefore we finally find that 


( 65 ) TCu. u. a) ^ 


-aH 

O-/’! * 




6 








V ^ 


yy-..y Sri 




5 > 


or 


(66) 2ils^^s^ysuj. i s . ii»s2.Ai£. 


0 ^d)-^T/Z (Ti 







XI. The distribution of the covariance of a sample of $n$ from
a bi-variate normal population:


Let LL 


P 




0'P^)<U<r^ J--' ^ 

where $x_j$ and $y_j$ are distributed according to

L_ 

/ ^0-A^)L<r2 <r-J 


LC" Oil S" 

cLx cl'^ 


eO *» 

(67) J-Jf. 




2w fh^ 




If we impose the conditions

$$\sum_{j=1}^{n} x_j = 0, \qquad \sum_{j=1}^{n} y_j = 0$$


so that u = 




then 

(68) 

and 

(69) 


%) 


n^) - 


O'pV 


TTzt 




O-rO 


ZTT 


nz! .Ltu 

_-L (i-pXii-ci 


dt 


Consider' ^ 

(70) J , _L f 
xw J 


-tzbu. 

I <d.-t 




' <30 




n~f 

X 





Pi: 

£J ' SO that 


•(71) J- 


77 ** 3 U.CF-0 

u'^ e 


u.i-Coa 

■ ■ " / 

e ' 


(xp)^ XITL I 




Since we may show that 


. 1-^ O 

the integral is convergent and we may write

(72) T f ±_dl_ ^ 

\ Ji.py.rhl ) 

/iiO) lurv / /~S■^ * f tj. I *- 


(72) J 


Cyy Jirt J C-^y^CiF'^y 


f ^ 

where $\int_{\infty}^{(0+)}$ means that the path of integration starts at infinity
on the real axis, encircles the origin in the positive direction
and returns to the starting point. (See Whittaker and Watson,⁴⁰
pp. 239, 333.)

Since ^ , the point 2 = - is outside the con- 


tour so that 


(73) - J— ( 

where $W_{k,m}(z)$ is the confluent hypergeometric function.⁴⁰

Also, since = ^e-m finally 

(74) ;g, _ * ^ e K^c^) 







If we start with the following definition for the Bessel Func- 
tion of the second kind and imaginary argument^®’ 


(75) 


- 


Vr X 


A jmi' 


•mi'-k I 


t 

/ 


^x.t 

e Ce->) cLt 


then it is possible to show that ^ ^ 

u e 


so that 

(76) 




ir Z ^ ^ 


TL 

X. 


yt , C 

If we finally set V ^ 


we find for the distribution of s/ , 


rH pv 

3 U 


a. 


(77) ^ 

'fir jvd. 

i S. 

which is the form found by K. Pearson, G. B. Jeffery, F.R.S. and
E. M. Elderton.²⁵

XII. Do $r$ samples, each of $n$ categories, come from the
same $n$-variate normal parent? Consider


*«- hi "tJ 

'T s: 

pj jZl- /L- 

rfjir-/ /Y 


</«• 




where the^ sioiul- 


taneoiis .distribution of „x^ ^ 

Itpa^ 

(78) e 


, is given by 




where $R_{jk}$ denotes the cofactor corresponding to $\rho_{jk}$ in the determinant
$R = |\rho_{jk}|$ of the population correlations and
$\sigma_j$ is the standard deviation of the $j$th variate.





Consider 


«a» 


'-B 1. - 


JlR 






"TC^ "X. a * * • 7 


/A 


■ - , ' 

» * 

■7^ 

’ If we impose the conditions .21 = o ^ i, and 

^ ^ i ^ then from the previous results the 

characteristic function for the distribution of X becomes 


and the distribution for 



( 


X. 


(79) PCX)- ^ 


’ -Lt% 
e cLt 

0~zit) ^ 




z 


-X. 

Z 


Cw-zX/V'd 


/-h 




This case is equivalent to applying the $\chi^2$ test to a contingency
table. If the table has $r$ rows and $c$ columns then the
value of $n'$ to be used in Elderton's tables of "Goodness of Fit"
is⁹ $n' = (r-1)(c-1) + 1$ [as we saw in Section IX, equation
58, the distribution for $\chi^2$ has an exponent $\frac{n-3}{2}$ (our $n$ is
equal to the $n'$ of the table) and the exponent in the distribution
above is $\frac{(r-1)(c-1) - 2}{2}$].
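The rule for entering the tables reduces to a small computation. The sketch below simply encodes $n' = (r-1)(c-1) + 1$ and the corresponding exponent $(n'-3)/2$ of $\chi^2$; the function names are hypothetical, and the count $n' - 1$ is what later usage calls the degrees of freedom.

```python
from fractions import Fraction


def elderton_n_prime(rows, columns):
    """Value of n' for a contingency table with the given numbers of rows and columns."""
    if rows < 2 or columns < 2:
        raise ValueError("a contingency table needs at least 2 rows and 2 columns")
    return (rows - 1) * (columns - 1) + 1


def degrees_of_freedom(rows, columns):
    """n' - 1, i.e. (rows - 1) * (columns - 1)."""
    return elderton_n_prime(rows, columns) - 1


def density_exponent(rows, columns):
    """Exponent (n' - 3) / 2 of chi-square in the distribution, as an exact fraction."""
    return Fraction(elderton_n_prime(rows, columns) - 3, 2)


if __name__ == "__main__":
    # A table with 3 rows and 4 columns: n' = 7, exponent 2.
    print(elderton_n_prime(3, 4), degrees_of_freedom(3, 4), density_exponent(3, 4))
```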

XIII. Distribution of the generalized variance of a sample
of $N$ from an $n$-variate normal population:⁴¹ One of the generalizations
considered by Wilks is that of the sample variance.
For a sample of $N$ from an $n$-variate normal population the generalized
sample variance is defined to be the determinant $|a_{jk}|$
where $a_{jk} = \frac{1}{N}\sum_{\alpha=1}^{N} (x_{j\alpha} - \bar{x}_j)(x_{k\alpha} - \bar{x}_k)$ and
$\bar{x}_j = \frac{1}{N}\sum_{\alpha=1}^{N} x_{j\alpha}$. Wilks has given the distribution of $a = |a_{jk}|$
as an $(n-1)$-tuple integral and has obtained the explicit form of
the distribution for $n = 1, 2$.

By employing the theory of characteristic functions we are
enabled to express the distribution of $a$ as a single integral and
find the explicit form for any value of $n$.

The simultaneous distribution of the $a_{jk}$ defined above is
given⁴² by


(80) 



rtj 




-2/4. 



N-n-2. 


a 


where | is the rj-th order determinant of elements 


hi /f* * 

where is the cofactor of in the 
determinant of parent correlations R - j . 

If we write ^ Bin and a. 


V»c 

tion of the 4'.^ is 


/In ‘‘•no 5 , the distribu- 


(81) 


|5,J 


V - S. B„tr, 




'/H 


F 

Jif 


I hn 1 


yV'- I?-* ’ 


For the sake of concreteness and the better to follow the discussion
for the general case, we shall first consider the cases
$n = 3, 4$ in detail.

Case 1, 7/“ 3 .* Let ^ where we write ^ 

The distribution of 4 then given by 


( 82 ) 






3 

X 



where 3 - j (Compare Wilks, Biometrika Vol. 24, p. 

477, equation 10.) 

Let (}(z§ ^ i:± ^ , then 


_y-3 






( 83 ) /^r^) = 


<(;^e ) 14 ^ — / — ■ j — 

= - — — — /e d 


1^ attL 




The integral is taken along the line x = - and since 

// > 3 (since otherwise the distributmn of the <^.^5 is nugatory) 
all the poles of the integrand are to the right of the line x^- ^ , 


Now /T^z == so that \_^ ^'iCnf/Zi: 


but 


■r^ 


ITtst 


so that -} ^ 


ir 








Now 


■£ivn- 

2 CO 


If we set £ e 


C/i773 


•I- 


■€-trn 

oo 






2 7r 


fc. 


I 


cm' 


Also 


ftom ^ 

y7-> <>o 

TT* 






cos TTi 


and 

jj -?> oo 


^ /S7i|' 


J»2 ^ CO 


>2. <:<?£: ^ c<9^0 B 




We also have that 

' ' “ *->«» 


tTJQ^. S/ii. e' 


according as $\sin\theta$ is positive or negative and that




I CM. T3tj 


< ■l< 


i-m g 


t TJl & 


according as $\sin\theta$ is positive or negative.

We find therefore that finally, 

\^Bf:.fZ.r.< y& t'-r. 

i-fotnf I J * ~~ ' ' ' 

according as $\sin\theta$ is positive or negative.

Therefore if f ^ ^ e ; < ^ < -e ; or if 


^ .< Jt. cml 0^0, 


4* * 


C 0 /-? ^- 2 - A? tends tinifonnly to zero as ^ tends 



to infinity and the integral is uniformly convergent.* 

Next if “€ j ^ e let ? where ir?h\ 

and 7f^ is an integer. Then, 




€ B [^7: fZ^ 1 - 2 - ^ 


^^ 3-3 -tJl^ / ^ € 1^1 

Zi'nf ^ 

7>7~>oo Jt^ 


where Z. M ^ /cscTT^/j = I A>-ec.-Tntj ■ ** 

^ 2- 

Therefore ^ C -0 tends to zero uniformly 

as 777 tends to infinity. 

We can now write 


/v-s 


( 84 ) 




fZzl 

j 2 . ’ Z 


zn 



^Ztli'^l'Z cl 


where C is the contour bounded by the line jc = - and 

that part of the circle /z/= , where tt) may be increased 

indefinitely, which lies to the right of this line; the contour is 
traversed in a counter-clockwise direction. 


The value of 



Pk cl Ttr 


is $2\pi i$ times the sum of the residues at the poles within the contour $C$.
For ?si7 there is a simple pole at which the residue is /x ^ 

For , there is a simple pole at which the 

* MacRobert,²⁰ p. 139, Rule II.

** MacRobert,²⁰ p. 114, Lemma.





residue is 


M.i-1 ^ 

(->) T e e> 

3L 


smce 


/r. = - 


-tT 






TT 


S; 




r ; 4-2 


TT- 


c^roJTZr 




and the residue of !I_ for is equal to i±Lt 

^ [Zh 

For where -i. is an integer other than zero, the inte- 


grand has a pole of the second order, viz., that of fZt so 
that the residue is 



e a 

^os TTi: / 2^i /a^/ 




Finally we have 


(85j = 


^3 


1^ I'l&lm 


t!tQ ^ ^ Jt. 

i-’T^rCeV / - + 


Jt*t 


'JL 

dik 


(e^B) 


If we make the substitutions ^ ^ /V" where 

^ “ ^‘hi'l and O ’ —g where A » | we have for tl*e 
distribution of a- 





( 86 ) d(^)da- 




da. C‘^4) " 




or 



Case 2, $n = 4$: With the same notation as before, we find that


oo 



- OO 


Let i- £t ^ ^ so that 


(89) FC^)-. 


C&et) 


iV~H 







6 k'-t dt. 


A similar discussion as for the case $n = 3$ applies here with
regard to the convergence and we can write here too,




( 90 ) 






where the contour C is bounded by the line (A/>qy 

and that part of the circle |2|= , where m may be increased 

indefinitely, which' lies to the right of this line. The contour is 
traversed in a counter-clockwise direction. 

The value of J= f^-i fSt fS ^ i 

is $2\pi i$ times the sum of the residues at the poles within this
contour.

For ^-o there is a simple pole at which the residue is /-| /I * 

For 2:- ^ there is a simple pdle at which the residue is -2ir(e By, 


The integrand may also be written as

2. JL ^ ^ S 

JT T B 

Siv^Ti. Ca^Ti fi-H ^+1 

and the poles are those of and . 

We have already considered the simple poles "z 

For 2 » ^ an integer other than zero, the integrand has a 
pole of the second order, that of at which the residue is 

For Z* an integer other than zero, the integrand 

has a pole of the second order, that of —Lrz at which the 

ck?5 7/^ 


residue is 


' d, 

d.^ st'ri^Tx fin /z-4 







We thus find that 


(91) FC4h 


A ^ 

' Z ^ Z f Z / - 2 . 


JL-zirfe-^D) 
1 


'f2- 


4 * 


OO 

Jl=l 




(e^B)' 


icJi 

For the distribution of a , we find 


SlV^TT^fZl 



iifAj 


(92) Ro.) = 


, ^ z 


/a / 2 /-x /-j- 


- 1: f 

;2 .. ■ ■• 




4 K? 


^ {a, A ) 

Cos^WZ 


Jl-i 


Kr 


C^A) 


/jE-y 


— » 3 t, 


Case 3, $n$ even: As is evident from the previous discussion,
retaining the same notation,

oa 

^ci4 ^ 

e 3 K dt . 


( 93 ) Pc^y 


/ 


J/5 

j»r ' 2 


3# -2F 





Let 


A/~ ri> 




: t = - ..B 


so that 




fL^ 


( 94 ) le 


I hd ZTTL 

j., / J: 






2 di 


The same considerations as to the convergence and the contour
are applicable here too and we find that


( 95 ) m) - 



where C is the contour bounded by the line X ^ 

and that part of the circle /^/“ 777 f j , where 777 may increase 

indefinitely, to the right of this line and the contour is traversed 
in a counter-clockwise direction. 



is $2\pi i$ times the sum of the residues at the poles within the
contour. Let us write so that the ' integrand is 


* 

e a 


/ X ^ 




z 



For ? 

(.«) 


yi- 3.^- ' ■ there is a pole of the 

order, the integrand being representable in the form 




/ 

$:i^: ' r ^ . Im \ 






The residue is therefore, 


(-0 


(-1) ^ nl 


/s ■ ' ' /s-. 






(Jt'tl) 


For B - yz- there is a pole of the 

order, the integrand being representable in the form 


/ T , ■ ■ . /w^ 

to^ 'wi h-^ • • • 

The residue is therefore of the form 


f-O 


j, if, ^ 

i , A i , 

d. e 0 /-£ //-i 


60 





f?Bir 
7 ^ 







For £ » ^ ^ there is a pole of the 

it/x/ order, the integrand being representable as 

^ fc ^ ^2: ir 

C-O ff TT €. 


si-n TTi ctfa n-i 

The residue is therefore of the form 


(-0 w 


F-f 


dit 2t 

e B 


i , ’h^f p / / — “ ! — 

(-!) ' ‘ ' /2-/o^Z 

For H= 

^ _ *A, order at which the residue is 


^ A,-o^/^z .. . there is a pole of the 


c-n fT 

(-') c^-o' 


r M 


4 * *■ 

e e 







We have therefore that 

( 96 ) TC^)- 


f-) ^ rp 4 t I , ! , r 

■/ j V-o ^ U e e jM-irlw-f -jp-i-i 


(d-V ' 

Tt, 

J 

J=» '2 


di 


/iZ ft Iz 


. V-<) " U e 5 y 


Mi- 


fz-ll-i- -jf-l-Zr j^l-if -ij/tf-Z-i- —j^ 

fLu fi-L'-- L-Af-i 


- d>-i 

to r id. 


e B 











/-t Ji-Tt: lf>-i-2r j'U-i-i: 


f 


A,: 


•' fd-, 






\f-/J ?r^ 




Ci^-O!- ca$ JitH /iEf- ^ • /s 


-^■=.0 

«K? 




f i>-f 

(-!} TT'^i^ 




0^/^f 

] 

sr-n’^Ti: 1^, fTt-^ ■ 





with 

Case 4, $n$ odd: As before we find that


(98) 


^Ci)^- 


Ce^e; 







Let 

The integrand is 



The considerations are similar to the case for $n$ even except
that the integrand has an additional factor, viz.

For Cp-t) there is a pole of the 

order at which the residue is 






f^, TF -/iZL 


^ — >C#”/, 







For 5 ^ I t- 


there is a pole of the 


order at which the residue is 

f-O ^ fd.'" ■'7^'- ^ 


U St 

(-0 JL I L 




For 2 y ^ jz^ • there is a pole of the 

j- f-Zt. order at which the residue is 

.j>? 

6') r 


Z 

e ^ 


For z = s o, (, 2^- ■ • there is a pole of the 

-^ - t-h order at which the residue is 


■fi-!-' pt! f ,-f>~l 

t i l X 


B 

e 13 




since the integrand is representable as 

to tt'^ r^e 0 

We have therefore that 

(99) F(^)- 




ji^M yil 



Xf^u- e%%a ■■[;-. ] 







+ 


C-l) TT'f^ I (i. 




4^ t 

e e 




2: 


-pCi>^r-^f-<yt-l C ■*>-! 

k-0 I d 


4z- i 
e B 








The distribution for $a$ is

( 100 ) D(o-)=- 


^ ^ f \ • nX^'^ C 2r f I /"*" t' '■ " . . — 

A CL I u-o ^ /j-^ l%.-z jjin-tj-nn-i "fp- 




Il-3, ■•■lt>-1= Ut%'ir -j^^. 

L^' 

EaO 




f ^ i° 

2 : 

\C -0 7 "^ 

d ^ 

(a./U 

Z i 

di.-^ 







+ Vc~*) 7--^' 


C-4/ 

Z ' i 


, ■^/"/ ,/ /-^ j 

/£/./ 

"'witlt. a 







It is of interest to derive from the general formula the distribution
when $n = 1, 2$.

For $n = 1$ the value of $p$ in equation (100) is zero. The
expression in the brace in equation (100) becomes




■— cuA 


// 




~ e 


so that 


(101) 


DCo.) 


/VzJ (iti -cu A 

A "a. ^ e 


For $n = 2$ the value of $p$ in equation (97) is 1. The expression
in the brace in equation (97) becomes

fiCY>\ ^ rrCa.Af’ 

(102) yu f- y- . . , 

fZ /iL fZ 


3/i. 


TfCo-f^y frM) ttCxA) 




nn 








R(iX.A ) 




fl 


II 


x! 


si 


+ • 


•fz - a V«l /4 

r e ; 


there is no difficulty about combining the infinite series in equation
(102), since each is absolutely convergent for all values of $a$.





Therefore, 


( 103 ) 


2)(kj = 



/V“3 - 

»■ ' 5t. 

CL e 







A^-3 




■ a. 






A/-Z 


The explicit expressions for $n = 1, 2$ have already been
obtained otherwise by Wilks.⁴¹

PART 3 
Conclusion

XIV. Summary and Conclusions: By the use of a discontinuity
factor derived from Fourier’s Integral Theorem we obtain
the characteristic function (in the sense of P. Levy) of the dis- 
tribution law, and the distribution law of very general functions of 
variables satisfying a continuous distribution law. In the appli- 
cation of the general theory a certain lemma is found to simplify 
the calculations for a particular class of distribution laws and 
functions. Several of the distributions derived are presented not 
because the results are new but as illustrations of a general 
method of procedure which it is hoped will enable us to find the 
distribution laws of many functions not yet obtained. 

The explicit form of the distribution of the generalized sample
variance for an $n$-variate normal population is derived. The same
analysis is applicable to find the explicit form of the other generalizations
introduced by Wilks, for general $n$, since the integrals
that must be evaluated are all of the same general nature.
The writer hopes to be able to present these further results in the
near future.

NOTE

After this paper had been completed, the writer's attention was drawn
to the fact that an analysis very similar to that of Sections VIII, X, and XI
of this paper had already appeared in two papers by Wishart and Bartlett,
viz:

"The distribution of second order moment statistics in a normal system,"
Proc. Cambridge Phil. Soc., Vol. 28 (1932), p. 455 f.

"The generalized product moment distribution in a normal system,"
Proc. Cambridge Phil. Soc., Vol. 29 (1933), p. 260.

These sections are, however, presented here as illustrations of the
Lemma of Section VII.

BIBLIOGRAPHY 

1. Boucher: Introduction to Higher Algebra, pp. 30-33. 

2. Cauchy: Comptes Rendus, Vol. 37 (1853), pp. 100, 150, 198, 264, 326. 

3. Charlier, C. V. L.: Arkiv Fur Math. Astron. Och Fysik. Vol. 2 
(190S-d) No. 8, No, 15; Vol. 4 (1908) No. 13; Vol 5 (1909) No. 15; 
Vol. 7 (1912) No. 17; Vol. 8 (1912) Nq. 2, No. 4; Vol 9 (1913) 
No. 25, No. 26. 

4. Czuber, E. : Wahrscheinlichkeitsrechnung, I (1914), p. 66. 

5. Dodd, E, L. : The Frequency Law of a Function of One Variable. 
Bull. Am. Math. Soc., Vol. 31 (1925) p, 27. 

6. Dodd, E. L. : The Frequency Law of a Function of Variables with 
Given Frequency Laws, Annals of Math., 2nd S., Vol. 27 (1925), pp, 
12 - 20 , 

7. Craig, A. T. : On the Distribution of Certain Statistics, Am. Jour, of 
Math., Vol. LIV (1932), pp. 353-366. 

8. Fisher, R. A. : Frequency Distribution of the Values of the Correlation 
Coefficient in Samples from an indefinitely Large Population, Bio- 
metrika, Vol. 10 (1914-15), pp. 507-21. 

9. Fisher, R. A.: On the Interpretation of X from Contingency Tables 
and the Calculation of P . ; Jour. Roy. Stat. Soc., Vol. 85, p. 87. 

10. Gronwall, T. H.: The Theory of the Gamma Function; Annals of 
Math., Vol. 20 (1918-19), p. 48, Th. XIII. 

11. Plausdorff, F. : Beit rage zur Wahrscheinlichkeitsrechmng, Koniglich 
Sachsischen Giesellschaft der Wissenshaften zu Leipzig. Berichte uber 
die Verhandlungen Math. — Phys. Classe, Vol S3 (1901, pp. 152-178. 

12. Hobson: The Theory of Functions of a Real Variable (1907), p. 590. 

13. Irwin, J. O. : On the Frequency Distribution of the Means of Samples 
from a Population having any Law of Frequency with Finite Moments 
with Special Reference to, Pearson*s Type 11; Biometrika, Vol 19 
(1927), pp. 225-39. 

14. ' Kanieda, T, : Theorie der erzeugenden funktion und , ihre anwendung 

auf die Wahrscheinlichkeits-Rechnung, Proc. Math. Phys, Soc. ; Tokyo, 
Vol 8 (1915-16), pp. 262, 336, 556 ff. 

15. Kameda, T.: Eine Verallgemeinerung des Poissonschen Problems in 
der Wahrscheinlichkeits-Rechnung;' 'Proc. Math, Phys. Soc., Tokyo, 
Vol 9 (1917-18), pp. ISS'C ’ 

16. Laplace: Theorie Analytique des Probability 3rd Ed. (1820), pp. 3ff ; 

pp. 80ff. '' ' '' ' ' 



306 CHARACTERISTIC FUNCTIONS AND DISTRIB UTION 


17. Levjj P. : Cakul des Probabilites, p, 161. 

18. Levy, P.: Comtes Retidm, Vol. 176 (1923), pp. 1118-1120; pp. 1284- 
1286. 

19. Levy, F. : Bull, de k Soc. Math, de France, Vol. 52 (1924), pp. 49-85, 

20. MacRobert, T, M. : Functions of a Complex Variable (1925), 

21. Molina, E. C. : The Theory of Probability: Some Comments on La- 
placer's Theorie Analytique; Bull. Am. Math. Soc.; Vol. 36 (1930), 
pp. 369 ff. 

22. Pearson, K.: On the Criterion that a Given System of Deviations 
from the Probable in the Case of a Correlated System of Variables is 
such that it can be reasonably supposed to have arisen from random 
sampling.-— Phil Mag., 5th series, Vol 50 (1900), p. 157. 

23. Pearson, K. : On the Distribution of the Standard Deviations of Small 
Samples: Appendix I to papers by “Student’^ and R. A. Fisher, Bio- 
metrika, Vol. 10 (1914-15), pp. 522-29. 

24. Pearson, K. : On a Brief Proof of the Fundamental Formula for Test- 
ing the Goodness of Fit of Frequency Distributions and on the Probable 
Error of P. Phil Mag., 6th Series, Vol. 31 (1916), p. 369. 

25. Pearson, K. : Jeffery, G. B»; Elderton, E. M. and F. R. S. On the 
Distribution of the First Product Moment Coefficient in Samples Drawn 
from an Indefinitely Large Normal Population, Biometirka, Vol 21 
(1929), pp. 164-201. 

26. Pearson, K.; Stouffer, S. A.; and David, F. N.: Further Applications
in Statistics of the $T_m(x)$ Bessel Function. Biometrika, Vol. 24
(1932), pp. 293-350.

27. Poincaré, H.: Calcul des Probabilités, 2nd Ed. (1923), p. 206.

28. Poisson: Connaissance des temps de l'année 1827.

29. Poisson: Recherches sur la Prob. Chap. IV. 

30. Rhodes, E. C. : On the Problem Whether two given Samples can be 
Supposed to have been drawn from the Same Population. Biometrika, 
Vol. 16 (1924), p. 239. 

31. Rider, P. R.: A Survey of the Theory of Small Samples; Annals 
of Math., 2nd S., Vol 31 (1930), pp. 577-628. 

32. Rietz, H. L. : On a Certain Law of Probability of Laplace; Interna- 
tional Math. Congress, Toronto, Canada, 1924. 

33. Rietz, H. L.: On the Representation of a Certain Fundamental Law
of Probability; Trans. Am. Math. Soc., Vol. 27 (1925), pp. 197-212.

34. Romanovsky, V.: On the Moments of Standard Deviation and of
Correlation Coefficient in Samples from Normal Population; Metron, Vol. 5
(1925), No. 4, pp. 3-46.

35. Schols, Ch. M.: Démonstration directe de la loi limite pour les erreurs
dans le plan et dans l'espace. Annales de l'École Polytechnique de Delft,
Vol. 3 (1887), p. 195.

36. "Student": The Probable Error of a Mean; Biometrika, Vol. 6
(1908-9), pp. 1-25.

37. Watson, G. N.: A Treatise on the Theory of the Bessel Function.





38. Webster, A. G.: Partial Differential Equations of Math. Physics
(1927), p. 158 ff.

39. Whittaker & Robinson: The Calculus of Observations (1924).

40. Whittaker & Watson: Modern Analysis, 2nd Ed. (1915).

41. Wilks, S. S. : Certain Generalizations in the Analysis of Variance, 
Biometrika, Vol. 24 (1932), pp. 471-94. 

42. Wishart, J.: The Generalized Product Moment Distribution in
Samples from a Normal Multivariate Population. Biometrika, Vol. XXA
(1928), pp. 32-52.



ON MEASURES OF CONTINGENCY 


By 

Frank M. Weida

1. Introduction. When we deal with the problem of relationship
of attributes, we may classify each attribute into a number
of groups. To illustrate: If the attributes are $X_1, X_2, \ldots, X_k$,
and if the group belonging to $X_1$ is $x_{1 i_1}$ $(i_1 = 1, 2, \ldots, m_1)$,
that belonging to $X_2$ is $x_{2 i_2}$ $(i_2 = 1, 2, \ldots, m_2)$, ..., that
belonging to $X_k$ is $x_{k i_k}$ $(i_k = 1, 2, \ldots, m_k)$, we may form an
$m_1 \times m_2 \times \cdots \times m_k$ table which contains
$m_1 m_2 \cdots m_k$ compartments. In this fashion, it is possible to
distribute the total frequency of the "universe" or the "sub-universe"
into sub-groups which correspond to these
$m_1 \times m_2 \times \cdots \times m_k$ compartments.

For such situations, Pearson¹ and others² have suggested certain
measures of relation between the attributes. We shall in this
paper be interested primarily in Pearson's measures of contingency.
In the case of two attributes, Pearson proceeds as follows:
Suppose that $A$ is any attribute and let it be classified into the
groups $A_1, A_2, \ldots, A_s$, and let $B$ be another attribute
classified into the groups $B_1, B_2, \ldots, B_t$. Let the total
number of individuals examined be $N$. Now, the probability
a priori of an individual falling into the group $A_u$ is
$n_{u\cdot}/N$, where $n_{u\cdot}$ is the number which fall into $A_u$.
Again, the probability a priori of an individual falling into the
group $B_v$ is $n_{\cdot v}/N$, where $n_{\cdot v}$ is the number which
fall into $B_v$. If the attributes are independent in the probability
sense, then, if $N$ pairs of attributes are examined, the number
expected in the $(u, v)$ compartment is

$$\nu_{uv} = N \cdot \frac{n_{u\cdot}}{N} \cdot \frac{n_{\cdot v}}{N} = \frac{n_{u\cdot}\, n_{\cdot v}}{N}.$$

Suppose the number observed is $n_{uv}$. Then, if we allow for
the errors of random sampling, $(n_{uv} - \nu_{uv})$ is the departure
from independent probability of the occurrence of the groups
$A_u$, $B_v$. Then, any measure of the total departure from independent
probability is termed by Pearson a measure of contingency.
Consequently, the measure of contingency is some function of the
$(n_{uv} - \nu_{uv})$ quantities for the whole table.

¹ Pearson, Karl, "On the Theory of Contingency and its Relation to
Association and Normal Correlation," Drapers' Company Research
Memoirs, Biometric Series I; Dulau & Co., London, 1904.

² Yule, G. Udny, "An Introduction to the Theory of Statistics," Charles
Griffin & Company, Limited, London, 1927, pp. 17-74.

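Again, for a given

$$\chi^2 = \sum_{u,v} \frac{(n_{uv} - \nu_{uv})^2}{\nu_{uv}},$$

where $n_{uv}$ and $\nu_{uv}$ are the observed and expected numbers in the
$(u, v)$ compartment, Pearson has shown how to obtain the probability³ $P$
as a measure to determine how far the observed system is not compatible
with a basis of independent probability. He calls $(1 - P)$ the
contingency grade and

$$\phi^2 = \frac{\chi^2}{N}$$

the mean square contingency. Also,

$$\psi = \frac{\sum (n_{uv} - \nu_{uv})}{N}$$

is the mean contingency when $\sum$ refers to summation for all
positive terms.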
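
As a concrete illustration, the three quantities just described can be computed directly from a two-way table of observed numbers. The following is only a minimal sketch: it assumes the table is supplied as a list of rows of counts with no empty row or column, and the function name `contingency_measures` and the example table are illustrative choices, not taken from Pearson's memoir.

```python
def contingency_measures(table):
    """Return chi-square, mean square contingency and mean contingency.

    `table[u][v]` is the observed number in the (u, v) compartment.
    Assumes every row total and every column total is positive.
    """
    total = sum(sum(row) for row in table)              # N
    row_totals = [sum(row) for row in table]            # numbers falling in A_u
    col_totals = [sum(col) for col in zip(*table)]      # numbers falling in B_v

    chi_square = 0.0
    positive_departures = 0.0
    for u, row in enumerate(table):
        for v, observed in enumerate(row):
            # number expected under independent probability
            expected = row_totals[u] * col_totals[v] / total
            departure = observed - expected
            chi_square += departure ** 2 / expected
            if departure > 0:                           # positive terms only
                positive_departures += departure

    return {
        "chi_square": chi_square,
        "mean_square_contingency": chi_square / total,
        "mean_contingency": positive_departures / total,
    }


if __name__ == "__main__":
    print(contingency_measures([[10, 20], [20, 10]]))
```

For a table whose rows are proportional to one another, every departure is zero and all three measures vanish.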

In his theory of contingency, Pearson appears to use the defi- 
nition of probability used in practically all treatises on the subject. 

³ Pearson, Karl, "On the criterion that a given system of deviations
from the probable in the case of a correlated system of variables is such that
it can be reasonably supposed to have arisen from random sampling," Phil.
Mag., Series V, Vol. 50, pp. 157-175.





This definition excludes the whole field of statistical probability. 
It appears fairly obvious that the development of statistical con- 
cepts is approached more naturally from a limit definition for 
probability than from the familiar definitions suggested by games 
of chance. It is the purpose of this paper to improve the treat-
ment of Pearson's theory of contingency and make it more ele-
gant for theoretical as well as empirical discussions. To accom-
plish this we make use of the notion of characteristic function⁴
and a definition of probability that includes all forms of proba- 
bility. It is believed that we have thus idealized Pearson's con- 
ception of contingency. We discuss multiple as well as partial 
contingency. We also consider briefly the case of certain dependent 
events and the concept of mutual exclusiveness, as well as the 
concept of connection. 

2. Definitions and assumptions. In our discussion we need
and use the following definitions and assumptions:⁵

Assumption I. If an event which can happen in two different
ways be repeated a great number of times under the same essential 
conditions, the ratio of the number of times that it happens in one 
way to the total number of trials, will approach a definite limit 
as the latter number increases indefinitely. 

Definition I. The limit described in Assumption I we call the
probability that the event shall happen in the first way under these
conditions.

Assumption II. If an event can happen in a certain number
of ways, all of which are equally likely, and if a certain number of 
these be called favorable, then the ratio of the number of favor- 
able ways to the total number, is equal to the probability that the 
event will turn out favorably. 

Assumption III. If an event depend on $n$ independent variables
$X_1, X_2, \ldots, X_n$ which can vary continuously in an $n$-dimensional
continuous manifold, there exists such an analytic function
$F(X_1, X_2, \ldots, X_n)$ that the probability for a result corresponding
to a group of values in the infinitesimal region

$$X_1 \pm \tfrac{1}{2}\,dX_1, \quad X_2 \pm \tfrac{1}{2}\,dX_2, \quad \ldots, \quad X_n \pm \tfrac{1}{2}\,dX_n$$

differs by an infinitesimal of higher order from

$$F(X_1, X_2, \ldots, X_n)\, dX_1\, dX_2 \cdots dX_n.$$

⁴ The characteristic function of A is that function which is equal to
unity for the elements of A and zero elsewhere. Usually A is assumed to
be a sub-class of some class on which the characteristic function is defined.

⁵ Coolidge, J. L., "An Introduction to Mathematical Probability," The
Clarendon Press, 1925.

Definition II. If a variable $X$ take the different values
$X_i$ $(i = 1, 2, \ldots, n)$ with the respective probabilities
$p_i$ $(i = 1, 2, \ldots, n)$ and these are all the possible values for
that variable, then

$$p_i X_i \; ⁶$$

is called the mean value of the variable $X$.

Definition III. Two variables are said to be independent if the 
probability that one lie close to a given value is independent of the 
value of the other. 

3. Pearson's mean square contingency. Let the attributes be
$X$ and $Y$. Let $n_{ij}$ be the number of individuals having the
group value $x_i$ of $X$ and $y_j$ of $Y$. The total number of
individuals having the group value $y_j$ of $Y$ is $n_{\cdot j}$. The
total number of individuals having the group value $x_i$ of $X$ is
$n_{i\cdot}$. The total number of individuals examined then is $N$.
Now, suppose it is true that

(1) [equation illegible in source]

be, respectively, the mean values of [symbols illegible in source].

⁶ A repeated index means summation for all possible values of such
repeated index.





Since, in the case of independence, the mean of the product is the
product of the means,⁷ we have

(2) [equation illegible in source]

Now, if $\phi_{ij}$ is the characteristic function of the observation,
$\phi_{ij}$ has the value unity if the event succeeds and zero if the event
fails. Let $p_{ij}$ be the probability that the event succeeds and $q_{ij}$
the probability that the event fails. Then, the mean value
$M(\phi_{ij})$ of $\phi_{ij}$ is given by

(3) $M(\phi_{ij}) = 1 \cdot p_{ij} + 0 \cdot q_{ij} = p_{ij}$.

Similarly,

(4), (5), (6) [equations illegible in source]

But [relation illegible in source]; hence, in the case of independence,
[relation illegible in source].

Hence, from (2), (3), (4), (5), and (6), in the case of independence,
we have

(7) $p_{ij} = p_{i\cdot}\, p_{\cdot j}$.

In the case of dependence,⁸ we have that

(8) $p_{ij} = M(\phi_{i\cdot}\,\phi_{\cdot j}) \neq p_{i\cdot}\, p_{\cdot j}$,

where $M(\phi_{i\cdot}\,\phi_{\cdot j})$ is the mean value of
$\phi_{i\cdot}\,\phi_{\cdot j}$.

The quantity $(p_{ij} - p_{i\cdot}\, p_{\cdot j})$ represents the departure
between the mean value $\phi_{ij}$ has and that which it should have in
the case of independence.

Let us now consider the square of the departure relative to
$p_{i\cdot}\, p_{\cdot j}$, namely,

$$\frac{(p_{ij} - p_{i\cdot}\, p_{\cdot j})^2}{p_{i\cdot}\, p_{\cdot j}}.$$

For all cases, we have

(9) $\phi^2 = \sum_{i,j} \dfrac{(p_{ij} - p_{i\cdot}\, p_{\cdot j})^2}{p_{i\cdot}\, p_{\cdot j}}$,

which is Pearson's mean square contingency and $\phi^2 = \chi^2 / N$.

⁷ Coolidge, J. L., "An Introduction to Mathematical Probability," The
Clarendon Press, 1925, p. 62.

⁸ Tschuprow, A. A., "Grundbegriffe und Grundprobleme der
Korrelationstheorie," B. G. Teubner, Berlin, 1925, pp. 39-63.


Hence, it appears that we may interpret Pearson's mean square 
contingency as a coefficient of dispersion, namely, a measure of 
the deviation between the mean or expected number a cell should 
have in the case of independence and the mean or expected number 
it actually has relative to the mean or expected number a cell 
should have in the case of independence as a unit of measure 
summed for all cells.
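
The relation between this sum of relative squared departures and $\chi^2 / N$ can be checked in one line. The sketch below assumes that the mean values are replaced by the observed proportions, $p_{ij} = n_{ij}/N$, $p_{i\cdot} = n_{i\cdot}/N$ and $p_{\cdot j} = n_{\cdot j}/N$; this dot notation for the marginal totals is introduced here for illustration and may differ from the symbols of the original.

```latex
\begin{align*}
\sum_{i,j}\frac{\left(p_{ij}-p_{i\cdot}\,p_{\cdot j}\right)^{2}}{p_{i\cdot}\,p_{\cdot j}}
 &= \sum_{i,j}\frac{\left(\dfrac{n_{ij}}{N}-\dfrac{n_{i\cdot}\,n_{\cdot j}}{N^{2}}\right)^{2}}
                   {\dfrac{n_{i\cdot}\,n_{\cdot j}}{N^{2}}} \\
 &= \frac{1}{N}\sum_{i,j}\frac{\left(n_{ij}-\dfrac{n_{i\cdot}\,n_{\cdot j}}{N}\right)^{2}}
                              {\dfrac{n_{i\cdot}\,n_{\cdot j}}{N}}
  = \frac{\chi^{2}}{N}.
\end{align*}
```

Each term is thus a squared departure measured in units of the number expected under independence, which is the reading of the mean square contingency as a coefficient of dispersion.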

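4. Multiple and partial contingency. In the case of three variables,
suppose that it is true that

(10) [equation illegible in source]

where [definition illegible in source].

As before, in the case of independence,

(11) [equation illegible in source]

Again, if $\phi_{ijk}$ is the characteristic function of the observation,

(12) [equation illegible in source]

From (10), (11), and (12), in the case of independence, we
find that

(13) $p_{ijk} = p_{i\cdot\cdot}\, p_{\cdot j\cdot}\, p_{\cdot\cdot k}$,

and in the case of dependence, we have

(14) $p_{ijk} \neq p_{i\cdot\cdot}\, p_{\cdot j\cdot}\, p_{\cdot\cdot k}$.

The quantity $(p_{ijk} - p_{i\cdot\cdot}\, p_{\cdot j\cdot}\, p_{\cdot\cdot k})$
represents the departure between the mean value $\phi_{ijk}$ has and that
which it should have in the case of independence.

We now consider the square of the departure relative to
$p_{i\cdot\cdot}\, p_{\cdot j\cdot}\, p_{\cdot\cdot k}$.

For all cases, we have

(15) $\phi^2 = \sum_{i,j,k} \dfrac{(p_{ijk} - p_{i\cdot\cdot}\, p_{\cdot j\cdot}\, p_{\cdot\cdot k})^2}{p_{i\cdot\cdot}\, p_{\cdot j\cdot}\, p_{\cdot\cdot k}}$,

which we call the mean square multiple contingency in the case of
three variables or attributes.

In general, in case we have $n$ attributes:

(16) [equation illegible in source]

and for all cases:

(17) $\phi^2 = \sum_{i_1, \ldots, i_n} \dfrac{(p_{i_1 i_2 \cdots i_n} - p_{i_1}\, p_{i_2} \cdots p_{i_n})^2}{p_{i_1}\, p_{i_2} \cdots p_{i_n}}$,

which we call the mean square multiple contingency in the case of
$n$ attributes.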
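
For three attributes the construction can be sketched numerically. The code below is an illustration under stated assumptions, not the author's procedure: it takes a three-way table of counts, estimates each mean value by the corresponding observed proportion, and sums the squared departures relative to the product of the three marginal proportions. The function name and the nested-list layout `counts[i][j][k]` are hypothetical.

```python
def mean_square_multiple_contingency(counts):
    """Sum over cells of (p_ijk - p_i p_j p_k)^2 / (p_i p_j p_k).

    `counts[i][j][k]` is the number of individuals in cell (i, j, k).
    Proportions are used in place of mean values (an assumption).
    """
    n_i, n_j, n_k = len(counts), len(counts[0]), len(counts[0][0])
    cells = [(i, j, k) for i in range(n_i) for j in range(n_j) for k in range(n_k)]
    total = sum(counts[i][j][k] for i, j, k in cells)

    # marginal proportions of the three attributes
    p_x = [sum(counts[i][j][k] for j in range(n_j) for k in range(n_k)) / total
           for i in range(n_i)]
    p_y = [sum(counts[i][j][k] for i in range(n_i) for k in range(n_k)) / total
           for j in range(n_j)]
    p_z = [sum(counts[i][j][k] for i in range(n_i) for j in range(n_j)) / total
           for k in range(n_k)]

    phi_square = 0.0
    for i, j, k in cells:
        expected = p_x[i] * p_y[j] * p_z[k]
        if expected == 0:          # an empty group contributes nothing
            continue
        observed = counts[i][j][k] / total
        phi_square += (observed - expected) ** 2 / expected
    return phi_square


if __name__ == "__main__":
    perfectly_associated = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]
    print(mean_square_multiple_contingency(perfectly_associated))
```

A table built as the product of three sets of marginal counts gives zero, in line with the independence condition.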

Let us again consider the case of three attributes. We may
write

[equation illegible in source]

For a given $z_k$,

(18) [equation illegible in source]

is the partial mean square contingency between two attributes for
an assigned third attribute.

If this quantity is zero for every $k$, then [relation illegible in
source]. Similarly, if the corresponding quantities for the other two
pairs of attributes are zero for every $i$ and every $j$,
respectively, then [relations illegible in source].

We have thus proved the theorem, namely,

Theorem 1: The necessary and sufficient condition for the three
attributes to be independent is that [condition illegible in source].

It is fairly easy to see that, in the case of $n$ attributes, we have

[equation illegible in source]

For a given set of values of the remaining attributes,

[equation illegible in source]

is the partial mean square contingency between
two attributes for an assigned set of $(n - 2)$ attributes.

If this quantity is zero for any pair of attributes, and for every
associated set of assigned values, then [relation illegible in source].
Hence, we have the

Theorem 2: The necessary and sufficient condition for complete
independence in the case of $n$ attributes is that for every
pair it is true that [condition illegible in source].

Again, it is fairly easy to see that in general different values
assigned to the set of $(n - 2)$ attributes result in corresponding
different values for the partial mean square contingency. Hence, if we
form the weighted arithmetic mean of these different values, where
the respective weights are the relative numbers of individuals in
each sub-set, then we say that this weighted mean
is the partial mean square measure of contingency.
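
The weighted mean described above can be sketched for the simplest case of one assigned attribute. The code is a hedged illustration: it assumes the data are given as one two-way table of counts per value of the assigned attribute (a "slab"), computes the mean square contingency $\chi^2/n$ within each slab, and weights each by the relative number of individuals in it. The names `slabs`, `mean_square_contingency` and `partial_mean_square_contingency` are hypothetical.

```python
def mean_square_contingency(table):
    """chi-square divided by the table total, for one two-way table of counts."""
    total = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi_square = 0.0
    for u, row in enumerate(table):
        for v, observed in enumerate(row):
            expected = row_totals[u] * col_totals[v] / total
            if expected > 0:                      # skip empty rows or columns
                chi_square += (observed - expected) ** 2 / expected
    return chi_square / total


def partial_mean_square_contingency(slabs):
    """Weighted mean of the within-slab mean square contingencies.

    `slabs[k]` is the two-way table of the first two attributes for the
    k-th value of the assigned third attribute; the weights are the
    relative numbers of individuals in each slab.
    """
    grand_total = sum(sum(sum(row) for row in table) for table in slabs)
    weighted_mean = 0.0
    for table in slabs:
        slab_total = sum(sum(row) for row in table)
        if slab_total == 0:                       # an empty sub-set has no weight
            continue
        weighted_mean += (slab_total / grand_total) * mean_square_contingency(table)
    return weighted_mean


if __name__ == "__main__":
    slabs = [[[5, 0], [0, 5]],     # complete association within the first slab
             [[2, 4], [4, 8]]]     # independence within the second slab
    print(partial_mean_square_contingency(slabs))
```

In the example the first slab contributes a mean square contingency of 1 with weight 10/28 and the second contributes 0, so the weighted mean is 10/28.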

5. Mean square dependence. Rietz⁹ invented games of chance
which give a meaning to correlation in pure chance. The writer
believes it important at least formally to propose a measure of
dependence based upon a probability schemata. As before, let the
attributes be $X$ and $Y$.

Let us assume that [relation illegible in source]. Then,
[relation illegible in source], whence,

[equation illegible in source]

where each symbol denotes the mean value of the corresponding
quantity [symbols illegible in source].

The quantity so formed represents the departure from dependence
for the particular scheme under discussion.

We now form the quantity defined as the square of this departure
relative to [symbol illegible in source].

For all cases, we have

[equation illegible in source]

which we call the mean square dependence.

Our concept of dependence may be extended to cases of more
than two attributes and measures of multiple as well as partial
dependence may be obtained in an analogous fashion. It thus
appears that we have, at least formally, a general criterion for
dependence and an approach to a general criterion which may
serve as a measure of goodness of fit.

We also note that in every contingency table the events designated
by the several compartments are mutually exclusive for every $i$
and $j$.

⁹ Rietz, H. L., "Urn schemata as a basis for the development of
correlation theory," Annals of Mathematics, Vol. 21, 1919-20, pp. 306-322.
6. A measure of connection. We here propose to idealize
Gini's measure of connection which has been fully discussed by
the writer elsewhere.¹⁰ Gini's measure of connection is of interest
and importance since one of his special indices of connection is
Pearson's correlation ratio and one of his special indices of
concordance is Pearson's correlation coefficient. These facts are
established in my paper referred to above.

As before, let $n_{ij}$ represent the number of individuals having
the group value $x_i$ of $X$ and $y_j$ of $Y$ in case we have the two
attributes $X$ and $Y$. The total number of individuals having the
group value $y_j$ of $Y$ is $n_{\cdot j}$ and the total number of
individuals having the group value $x_i$ of $X$ is $n_{i\cdot}$. The total
number of individuals is $N$. The frequencies of $Y$ are distributed
according to a set of "partial" groups which correspond to the respective
modalities of $X$. If all the "partial" groups are similar to the
"total" group of frequencies of $Y$, then the distribution of modalities
of $Y$ is independent of the modalities of $X$ and $Y$ is not
connected with $X$. In other words, $Y$ is not dependent upon $X$
but is independent of $X$ in the probability sense. Again, if at
least one of the "partial" groups is not similar to the "total"
group of frequencies of $Y$, then the distribution of modalities of
$Y$ is dependent on the modalities of $X$ and $Y$ is connected with
$X$. In other words, $Y$ is dependent on $X$ and is not independent
of $X$ in the probability sense.

We now multiply the frequencies of each "partial" group by
a number such that the total frequency of each "partial" group
is the same as the number of cases examined. For a given cell,
the frequency is then $N n_{ij} / n_{i\cdot}$ and the total frequency of
this "partial" group is then $N$.

Let us now consider the quantity defined by

[equation illegible in source]

The mean values of the two quantities entering this definition are
[symbols illegible in source]. If $M_{ij}$ is the mean value of
their difference, then

(24) [equation illegible in source]

We now consider a quantity $d_i$ defined by

(25) [equation illegible in source]

which is Gini's simple index of dissimilarity and may be regarded
as the sum of the absolute values of a set of mean values.

We now consider the quantity [definition illegible in source].

For all cases, the mean value is given by

[equation illegible in source]

which is Gini's measure of connection of $Y$ on $X$. Thus, Gini's
measure of connection may be regarded as the mean value of a
set of sums of absolute values of mean values. An analogous
discussion holds for Gini's measure of connection
of $X$ on $Y$.

¹⁰ Weida, F. M., "On various conceptions of correlation," Annals of
Mathematics, Vol. 29, No. 3, July 1928, pp. 276-312.

It is fairly easy to see that the process may be extended to 
derive measures of multiple, partial and complete connection. This 
the writer intends to accomplish at a future date. 

7. Conclusion. It is believed that we have shown that the
theory of contingency, dependence and connection may be based
upon a definition of probability that includes all forms of
probability. Fluctuations in random sampling appear to be neglected in
such a treatment; however, the experiments may be carried out
with the probability schemata in case we desire the inclusion of
fluctuations in random sampling.

The George Washington University





NOTE ON KOSHAL'S METHOD OF IMPROVING
THE PARAMETERS OF CURVES BY THE USE OF
THE METHOD OF MAXIMUM LIKELIHOOD

By 

R. J. Myers 

It has been shown by R. A. Fisher⁽¹⁾ that the most efficient
parameters for Pearsonian curves may be found by the method
of maximum likelihood. In applying this method we maximize
the quantity

(1) $L = \sum_k n_k \log p_k$

by varying the parameters of the curve; $n_k$ denotes the observed
frequency of the $k$th class, and $p_k$ is the probability of an
observation falling in this class as determined from the curve and
is thus a function of the parameters. Thus, in maximizing $L$,
$p_k$ varies as the parameters are varied, but $n_k$ remains constant
throughout since it is fixed by the given data.

Usually it is impossible to obtain a solution to the maximum
likelihood equation so that some method of approximation must
be used. R. S. Koshal⁽²⁾ has devised a very ingenious method
of approximation, which can be summarized briefly as follows.
Values of $L$ are obtained first by varying only one parameter at
a time, and then by varying two parameters at the same time.
When only one parameter is varied, two values of $L$ are computed
for each parameter, whereas in the case of two parameters
being varied, only one value of $L$ is computed for each combination
of parameters. Thus, $2n + \tfrac{1}{2}n(n-1)$, or $\tfrac{1}{2}n(n+3)$,
values of $L$ would be needed for $n$ parameters. With these $L$'s
the constants of $n$ simultaneous equations involving the corrections
to the $n$ parameters can be determined, and then the
corrections themselves can readily be obtained.
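
The summary above can be made concrete with a small sketch. It is not Koshal's exact procedure: it assumes a second-degree approximation to $L$ about the starting parameters, uses symmetric increments of plus and minus one step for each single parameter (the paper instead chooses the two increments by a bracketing criterion), and uses the plus-plus combination for each pair. All function names are hypothetical. The count of evaluations beyond the starting value is $2n + \tfrac{1}{2}n(n-1)$, as stated in the text.

```python
from math import log


def grouped_log_likelihood(counts, probabilities):
    """L = sum of n_k * log(p_k) over the classes."""
    return sum(n * log(p) for n, p in zip(counts, probabilities) if n > 0)


def solve_linear(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def quadratic_corrections(likelihood, params, steps):
    """Corrections that maximize a second-degree surface fitted to L."""
    n = len(params)

    def value(shift):
        moved = list(params)
        for index, multiple in shift.items():
            moved[index] += multiple * steps[index]
        return likelihood(moved)

    base = value({})
    plus = [value({i: 1}) for i in range(n)]      # n single-parameter values
    minus = [value({i: -1}) for i in range(n)]    # n more single-parameter values
    slope = [(plus[i] - minus[i]) / (2 * steps[i]) for i in range(n)]
    curvature = [[0.0] * n for _ in range(n)]
    for i in range(n):
        curvature[i][i] = (plus[i] - 2 * base + minus[i]) / steps[i] ** 2
        for j in range(i + 1, n):                 # n(n-1)/2 two-parameter values
            both = value({i: 1, j: 1})
            mixed = (both - plus[i] - plus[j] + base) / (steps[i] * steps[j])
            curvature[i][j] = curvature[j][i] = mixed
    # setting the derivative of the fitted surface to zero gives n equations
    return solve_linear(curvature, [-s for s in slope])
```

When $L$ is exactly of the second degree the corrections land on the maximum in one step; otherwise they depend on which increments are used, which is the sensitivity discussed later in the paper.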

In applying this method a number of interesting results were
obtained. The data used were the same as used by Koshal⁽²⁾
because in checking through his work there were found several
serious numerical errors, especially in the computation of $\beta$.
This gave a poor fit so that the method of maximum likelihood
had more opportunity for improvement than if there had been
no error. These data are distributed according to a Type I
distribution, whose general equation is

(2) $y = y_0\,(x - \alpha)^{m_1}\,(\beta - x)^{m_2}$.

The values of the parameters as obtained from the moments are

$\alpha = .33461$

$\beta = 16.9885$

$m_1 = .69753$

$m_2 = 4.93202$.

The most convenient sizes of the increments for the parameters
were chosen, namely .1 for $\alpha$, $m_1$, and $m_2$ and 1.0 for $\beta$.

In the case of the $L$'s in which only one parameter is varied,
Koshal selected the two $L$'s to be computed for a particular
parameter in the following manner: it should be remembered that
$L_{0000}$, the value for the unaltered parameters, has already been
computed. As an illustration let us consider the $L$'s computed for
variations of $\alpha$. The criterion set up was that $L_{(x+1)000}$
should be greater than either $L_{x000}$ or $L_{(x+2)000}$, where $x$
may be $-2$, $-1$, or $0$. This criterion is justified by the common sense
reasoning that the maximum likelihood solution will then lie somewhere
between $L_{x000}$ and $L_{(x+2)000}$. However, in the case of the $L$'s
in which two parameters are varied, Koshal merely selected the
combination of the increments at random, computing a single $L$
for each pair of parameters. In carrying out my computations
I thought it best to use the same criterion on the $L$'s
in which two parameters were varied, as was used on the $L$'s
in which only one parameter was varied. For example, I gave
various values to $x$ and $y$ so that a number of values of
the corresponding $L$ were obtained. The largest of these was used
in the determination of the constants as explained before. It was
not necessary to give all values to $x$ and $y$ because a good many
combinations could be discarded by inspection: once one combination
was found to give a greater $L$ than a neighboring one, the
combinations lying beyond the smaller value obviously did not need
to be calculated.
The above process was repeated for the other $L$'s, and the
constants were then determined. From these the corrections to
the parameters were obtained; these corrections gave new
parameters as follows:

$\alpha = .38399$

$\beta = 16.5020$

$m_1 = .72547$

$m_2 = 4.80853$.

The frequency distribution obtained from these parameters was
quite a bit better than the original one as judged by both the $\chi^2$
test and its likelihood. However, it is important to note that two
of the double increment $L$'s used in obtaining the constants were
greater than the $L$ obtained from the new parameters. This
would seem to show that better results could be gotten by judicious
guessing than by using this method of approximation. Another
fact illustrating the roughness of approximation is that the values
of the constants when computed from other of the double increment
$L$'s vary by as much as 30% from those previously used.
Naturally with different values of the constants, different values
for the corrections to the parameters would be obtained. Several
combinations of different values of the constants were tried, and
a few of the resulting frequency distributions gave higher $L$'s
than the ones obtained previously, although there were none higher
than the two subsidiary $L$'s previously mentioned. It is not
unlikely that a combination of constants might be found so as to
yield a higher $L$ than either of the latter two, but there would
have to be a considerable amount of manipulation in order to find
this combination.





Another disadvantage of this method is the fact that a great
deal of time is required to apply it. Approximately sixty hours
were required to carry the calculations for the Type I curve.
Another interesting fact was brought out when the method of
Pearson and Pairman⁽³⁾ for correcting the moments for grouping
was applied to the original data. The frequency distribution
obtained was far better than any previously obtained, as shown by
the fact that the $L$ for this distribution was highest of all; $\chi^2$
for this distribution was 4.64. The time required to apply this
method was considerably less than needed for Koshal's method.

Since writing this paper my attention has been directed to the
recent article in the Journal (Vol. XCVII, Part II, 1934, p. 331)
by W. P. Elderton and G. H. Hansmann. In this paper the writers
used the same data as Koshal and fit these data by an ingenious
method due to Elderton.⁽⁴⁾ It is interesting to note that the $\chi^2$
of the distribution obtained by Elderton and Hansmann is
practically the same as that obtained when the method of Pearson and
Pairman was used. Elderton and Hansmann also came to the
conclusion that Koshal's method required more labor to bring about
the same results as other methods.

BIBLIOGRAPHY 

1. Fisher, R. A. "On the Mathematical Foundations of Theoretical
Statistics." Phil. Trans., A, vol. 222, pp. 309-368.

2. Koshal, R. S. "Application of the Method of Maximum Likelihood to
the Improvement of Curves Fitted by the Method of Moments." Jour.
Royal Stat. Soc., vol. XCVI, pp. 303-313.

3. Pairman, Eleanor and Pearson, Karl. "On Corrections for the Moment-
Coefficients of Limited Range Distributions when there are Finite or
Infinite Ordinates and any Slopes at the Terminals of the Range."
Biometrika, vol. 12, pp. 231-258.

4. Elderton, W. P. "Frequency Curves and Correlation," pp. 121-122,
2nd edition.



THE ADEQUACY OF "STUDENT'S" CRITERION
OF DEVIATIONS IN SMALL SAMPLE MEANS*

By 

Alan E. Treloar and Marian A. Wilder
Biometric Laboratory, University of Minnesota 

INTRODUCTION 

The origin of the movement toward precise evaluation of
probabilities based on the statistics of small samples would generally
be located by practical statisticians in the work of "Student"
(1908). The problem he considered is of such importance, not
only from the historical aspect, but also from a consideration of
the elements of statistical interpretation, that we wish to return to
an analysis of the adequacy of his solution. "Student" was
concerned with the problem of determining the significance to be
attached to the deviation of the mean, $\bar{x}$, of a small sample from
a probable (or possible) supply† mean, $m$, when the dispersal of
variates in the supply is unknown. The solution he suggested was
based upon derivation of the probability integral of the quantity

(1) $z = \dfrac{\bar{x} - m}{s}$,

where $s$ is the standard deviation of the sample. He found the
distribution of $z$ to be given by the equation,

(2) $df = \dfrac{\Gamma\!\left(\frac{N}{2}\right)}{\sqrt{\pi}\;\Gamma\!\left(\frac{N-1}{2}\right)}\,(1 + z^2)^{-N/2}\,dz$.

In 1915, Fisher indicated that "Student's" partly intuitive derivation
was sound, and in 1925 he returned to a more complete
exposition of the accuracy of the solution, at the same time widely
extending its application. Fisher at that time changed the variable
to $t = z\sqrt{n}$, where $n$ is the number of "degrees of freedom"
involved in estimating $\sigma$ (the supply standard deviation) from $s$.
"Student" (1925) cooperated in this extension by preparing tables
of the probability integral of $t$, using $n$ in place of $N$ as the
parameter. Since the integrals are of essentially identical curves, and $z$
will prove somewhat more adaptable in the present study, we will
conduct the discussion of the problem in terms of $z$. All conclusions
reached will apply with equal validity, of course, when $t$ is
used in place of $z$.

* Presented in part before a Joint Session of the Econometric Society
and Section K of the American Association for the Advancement of Science,
Boston, Dec. 30, 1933.

† Following Wicksell (e.g. Biometrika 25, p. 121), we shall use the term
"supply" in place of "population."

"Student" illustrated the usefulness of his $z$ distribution by
considering the $x$ values as a set of differences (between
experimental and control pairs, say), thus logically making $m$ equal to
zero. He then found the probability that the resulting $z$ would be
exceeded solely through random sampling errors. Although it is
not by any means clear from "Student's" original memoir that he
so intended, the custom has grown of considering this probability
as that which might be expected for the deviation of $\bar{x}$ from $m$
if a knowledge of $\sigma$ were available. Is such a transfer of the
probability really acceptable? The usefulness of the $z$ (or $t$) test
depends entirely on the answer to this question.

SIGNIFICANT DEVIATIONS 

In a supply of variates, $x$, whose frequency distribution accords
with the "normal" curve and whose total frequency approaches
infinity, let the mean be $m$ and the standard deviation $\sigma$. Assume
a large number of samples, each of total frequency $N$, to be drawn
independently and at random from this supply. Let the mean and
standard deviation of each sample be designated as $\bar{x}$ and $s$
respectively. Then the probability that values of $\bar{x}$ will deviate from $m$
by more than a certain amount may be determined exactly from
the "normal" integral. Letting

(3) $y = \dfrac{\bar{x} - m}{\sigma}$,

the distribution of $y$ will be given by the equation

(4) $df = \sqrt{\dfrac{N}{2\pi}}\; e^{-N y^2 / 2}\, dy$,

a "normal" curve with mean at zero and standard deviation of
$N^{-1/2}$. Values of $y$ numerically exceeding $1.96/\sqrt{N}$ will arise
but 5 times in 100, and this value would be known therefore as the
"5% level of significance." For $N$ equal to 5, this level is .8765.
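
The figure quoted for $N = 5$ can be reproduced from the normal integral. The sketch below uses Python's standard `statistics.NormalDist` to obtain the two-sided 5% point of the unit normal curve and divides it by $\sqrt{N}$; the function name `y_significance_level` is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist


def y_significance_level(sample_size, two_sided_level=0.05):
    """Value of |y| = |mean - m| / sigma exceeded with the given probability.

    y is normal with mean zero and standard deviation 1 / sqrt(sample_size).
    """
    unit_normal_point = NormalDist().inv_cdf(1.0 - two_sided_level / 2.0)
    return unit_normal_point / sqrt(sample_size)


if __name__ == "__main__":
    print(round(y_significance_level(5), 4))
```

The exact normal point is slightly below 1.96, but the result agrees with the text's .8765 to four decimal places.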

Let a single sample of 5 individuals, not known to be drawn
from the above supply, be made available. It may be desired to
test whether the mean, $\bar{x}'$, of this sample differs sufficiently from
$m$ to warrant the assumption, on the basis of the mean value alone,
that the sample has not been drawn from the above supply. If
$(\bar{x}' - m)/\sigma$ should exceed .8765, those depending on a 5% "level
of significance" would decide that the sample is significantly
different in the respect tested. However, $y$ will exceed this level 5
times in 100. It must therefore be expected that up to 5% of
samples like that designated by the prime above which are
investigated by this procedure will be erroneously segregated as
"differing significantly."

This maximum error of 5% is acceptable to most workers for 
two reasons: 

(i) Some such error must be accepted in order to have a basis
for differentiation, and 5% or less (generally less) erroneous
segregation is sufficiently small to be regarded by many as an
acceptable proportion of error;

(ii) The cases erroneously segregated in this manner are the
most rational ones to be subjected to the error, since they deviate
from $m$ by the greatest amount.

In practical statistical problems wherein the significance of the
deviation of a mean is to be tested, it is usually impossible to
apply the above reasoning because of lack of precise knowledge
of the value of $\sigma$. "Student's" test aimed to meet this deficiency
by finding the distribution of the $z$ already defined (equations 1 and 2).
Applying the probability integral of this variable, he reached his
conclusions about the significance of $z$ in the same way as has
been indicated for the variable $y$.

THE CORRELATION BETWEEN $\bar{x}$ AND $s$

In analyzing the adequacy of the procedure suggested by
"Student," it seems fruitful to consider the correlation of $\bar{x}$ and $s$.
Defining the latter in its original sense,

(5) $s = \sqrt{\dfrac{\Sigma (x - \bar{x})^2}{N}}$,

"Student" (unknowingly justifying Helmert's previous work)
concluded the distribution of $s$ is given by

(6) $df = \dfrac{N^{(N-1)/2}}{2^{(N-3)/2}\;\Gamma\!\left(\frac{N-1}{2}\right)\sigma^{N-1}}\; s^{N-2}\, e^{-N s^2 / 2\sigma^2}\, ds$.

This most important equation has not received the discussion
it deserves. Tables of the probability integral of $v$, where

(7) $v = s/\sigma$,

would also be most helpful in small sample analysis, if for no other
reason than to show the wide variation which must be expected
in $s$ for small values of $N$. An appreciation of this variation is
much more pertinent to the adequate solution of the problem
analyzed by "Student" than appears to have been realized. We
accordingly include here the 2½% points in $v$ for a few values of $N$
small.

[Table of the 2½% points of $v$, with its footnote, is illegible in the source.]





It will be seen from these figures that, for $N$ equal to 5, $s$ will
vary over the relatively very wide range of $.31\sigma$ to $1.49\sigma$ even
when only the central 95% of cases are considered. Inasmuch as
there is no correlation between $(\bar{x} - m)$ and $s$ when sampling is
made from a "normal" supply, the values to be expected for $s$ in
those samples where $(\bar{x} - m)$ is the same must vary widely solely
through the influence of variation in $s$.
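
The range quoted for $N = 5$ can be checked from the fact that $N s^2/\sigma^2$ follows the $\chi^2$ distribution with $N - 1$ degrees of freedom when $s$ is computed with divisor $N$. The sketch below is a minimal standard-library illustration, restricted to an even number of degrees of freedom (odd $N$), for which the $\chi^2$ integral has a closed form; the function names are hypothetical and the quantile is found by simple bisection.

```python
from math import exp, sqrt


def chi_square_cdf_even(u, degrees):
    """Chi-square distribution function for an even number of degrees of freedom."""
    term = 1.0
    series = 1.0
    for k in range(1, degrees // 2):
        term *= (u / 2.0) / k
        series += term
    return 1.0 - exp(-u / 2.0) * series


def chi_square_point_even(probability, degrees):
    """Value of chi-square below which the given probability lies (bisection)."""
    low, high = 0.0, 1.0
    while chi_square_cdf_even(high, degrees) < probability:
        high *= 2.0
    for _ in range(200):
        middle = (low + high) / 2.0
        if chi_square_cdf_even(middle, degrees) < probability:
            low = middle
        else:
            high = middle
    return (low + high) / 2.0


def v_limits(sample_size, tail=0.025):
    """Lower and upper points of v = s / sigma cutting off `tail` in each tail."""
    degrees = sample_size - 1
    if degrees <= 0 or degrees % 2:
        raise ValueError("this sketch handles odd sample sizes of 3 or more only")
    lower = sqrt(chi_square_point_even(tail, degrees) / sample_size)
    upper = sqrt(chi_square_point_even(1.0 - tail, degrees) / sample_size)
    return lower, upper


if __name__ == "__main__":
    print(tuple(round(x, 2) for x in v_limits(5)))
```

For $N = 5$ the limits come out near .31 and 1.49, so that the central 95% of samples have $s$ varying almost fivefold.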

Expressing $(\bar{x} - m)$ and $s$ in terms of $\sigma$ as the unit of
measurement, the simultaneous distribution we wish to analyze will
become that of $y$ and $v$. Since these variables are wholly
independent (see Fisher, 1925), their simultaneous distribution will
be given by the product of their separate probabilities, yielding

(8) $df = \sqrt{\dfrac{N}{2\pi}}\;\dfrac{N^{(N-1)/2}}{2^{(N-3)/2}\;\Gamma\!\left(\frac{N-1}{2}\right)}\; e^{-N y^2/2}\; v^{N-2}\, e^{-N v^2/2}\; dy\, dv$.

This surface is graphically portrayed in Figure 1 for the case when 
N equals 5. The few contours given are sufficient to indicate the 
general character of the distribution of frequency. Projection of 
the frequencies onto the two margins gives the univariate distribu- 
tions drawn in the Figure. 

If B and B′ be taken as the 2½% points for the y distribution,
then lines through them drawn perpendicular to the y axis will cut
off in the extreme zones of the surface and in the tails of the y
distribution those samples whose means deviate sufficiently from
m to permit their segregation according to a “5% level of significance.”

Since $z = y/v$, the samples segregated by the 5% level in applying
the z test must be bounded on one side (in each direction) by radial
lines traversing this surface and passing through the point
(y = 0, v = 0). Let b be the value of the 2½% point for the z
distribution. Then the cotangent of the angle of incidence to the
y axis will in each case equal b, that is, 1.3882 when N equals 5.
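The two 2½% points can be checked numerically. The sketch below is an illustration under stated assumptions rather than the authors' procedure: it takes y = (x̄ − m)/σ to be “normal” with variance 1/N, takes z to follow “Student's” original distribution with density proportional to (1 + z²)^(−N/2), and finds b by bisection on a Simpson's-rule integral. All function names are hypothetical.

```python
from math import gamma, pi, sqrt
from statistics import NormalDist


def z_density(z, n):
    """Density of Student's original z = (mean - m)/s for samples of n."""
    c = gamma(n / 2) / (sqrt(pi) * gamma((n - 1) / 2))
    return c * (1 + z * z) ** (-n / 2)


def simpson(f, a, b, steps=2000):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def y_point(n, tail=0.025):
    """Upper tail point B of y, in units of sigma."""
    return NormalDist().inv_cdf(1 - tail) / sqrt(n)


def z_point(n, tail=0.025):
    """Upper tail point b of z, found by bisection on the integral from 0."""
    target = 0.5 - tail

    def area(upper):
        return simpson(lambda z: z_density(z, n), 0.0, upper)

    lo, hi = 0.0, 1.0
    while area(hi) < target:         # widen the bracket until it contains b
        hi *= 2
    for _ in range(50):
        mid = (lo + hi) / 2
        if area(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


for n in (3, 5, 7):
    print(n, round(y_point(n), 4), round(z_point(n), 3))
```

For N = 5 this gives B close to .8765 and b close to 1.388, the slope of the radial boundary described above.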

All samples given by points in the shaded areas, E and F (Figure 1),
would be considered significantly deviating with respect to
x̄ according to customary interpretation of the z test. Those samples
in the shaded areas, F and G, would be segregated by the y
test. Only those samples in the cross-shaded regions, F, would be
selected by both tests. For the situation under discussion, wherein
the sampling is actually made from the one supply, no samples
really deviate in x̄ from m by an amount not logically to be
ascribed to random sampling effects. For reasons given earlier in
this discussion, however, the y segregates are all rationally made.
Only the z segregates in the double-shaded area F may be designated
as rational on the grounds given. Those in the single-shaded
area E are irrationally selected; the segregation has been made
because s is small, not because (x̄ − m) is large.


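THE CORRELATION BETWEEN x̄ AND z

An analogous geometric view may be presented by considering
the correlation surface for y and z. To obtain the simultaneous
distribution of these variables, the substitutions

       $v = y/z, \qquad dv = -(y/z^{2})\, dz$

may be made in equation (8), yielding:

(9)    $df = \dfrac{N^{N/2}}{2^{(N-2)/2}\sqrt{\pi}\,\Gamma\!\left(\frac{N-1}{2}\right)}\; y^{N-1}\, z^{-N}\, e^{-\frac{N y^{2}}{2}\left(1 + \frac{1}{z^{2}}\right)}\; dy\, dz$.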


In slightly different form, Pearson (1931a) has given this expression
and derived from it the equations for the correlation,
regression and scedasticity of the surface in terms of N. He
demonstrated that, although regression is rectilinear and the
correlation is very high, the distribution of z for constant y is
characterized by “excessive leptokurtosis and extreme skewness” for
N small, with




It is a simple matter to truncate the (y, z) surface into volumes
of frequency corresponding to the probability of occurrence of
given deviates in y or z. This is graphically portrayed in Figure 2,
where the surface is approximately represented for N = 5 and the
planes of truncation, BCD and bCd, correspond to the 2.5% points,
B and b respectively, for each variable. Since the frequency surface
is radially symmetrical about the point (y = 0, z = 0), only
one quadrant need be lettered. 2.5% of the area of the “normal”
y distribution lies in the minor segment bounded by the ordinate
AB, and 2.5% of the “leptokurtic” z distribution lies in the minor
segment bounded by the ordinate ab. Also, 2.5% of the total frequency
of the correlation surface lies in each of the two minor volumes
truncated by the vertical planes passing through AB and ab
respectively. Only that proportion of frequency lying beyond both
planes, i.e. in the area bCD, exceeds the given level for both
variables simultaneously.

The corresponding frequency volumes in Figures 1 and 2 representing
segregations by the y and z tests are as follows:

    Figure 1        Figure 2

    Zone E          Zone dCD
    Zone F          Zone bCD
    Zone G          Zone BCb

That the corresponding zones should not have the same relative
areas in the two figures is in accordance with expectation, since
the densities of frequency must vary widely within the zones and
in different manners from one zone to another. Interpretation of
the degrees of rational and irrational segregation by the z test must
depend upon evaluation of the integrals defining the respective
frequency volumes.

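EVALUATION OF INTEGRALS

For the (y, v) surface, the frequency over each double-shaded
zone F will be given by the expression

(10)   $A_{1} = K_{1} \displaystyle\int_{B}^{\infty}\!\!\int_{0}^{y/b} e^{-N y^{2}/2}\; v^{N-2}\, e^{-N v^{2}/2}\; dv\, dy$.

For the (y, z) surface, the corresponding frequency over the
area, bCD, will be given by the expression

(11)   $A_{2} = K_{2} \displaystyle\int_{B}^{\infty}\!\!\int_{b}^{\infty} y^{N-1}\, z^{-N}\, e^{-\frac{N y^{2}}{2}\left(1 + \frac{1}{z^{2}}\right)}\; dz\, dy$.

The constants, $K_{1}$ and $K_{2}$, prove to be identical in magnitude,
and we shall therefore give the evaluation of the latter only.

Integrating from zero to infinity in both directions, one secures
half the total frequency since the distribution appears equally
and solely in the two quadrants of positive product.

       $\dfrac{1}{2} = K_{2} \displaystyle\int_{0}^{\infty}\!\!\int_{0}^{\infty} y^{N-1}\, z^{-N}\, e^{-\frac{N y^{2}}{2}\left(1 + \frac{1}{z^{2}}\right)}\; dy\, dz$.

But

       $\displaystyle\int_{0}^{\infty} y^{N-1}\, e^{-\frac{N y^{2}}{2}\left(1 + \frac{1}{z^{2}}\right)}\, dy = \dfrac{1}{2}\,\Gamma\!\left(\tfrac{N}{2}\right)\left(\dfrac{2}{N}\right)^{N/2} \dfrac{z^{N}}{(1 + z^{2})^{N/2}}$.

Therefore

       $\dfrac{1}{2} = \dfrac{K_{2}}{2}\,\Gamma\!\left(\tfrac{N}{2}\right)\left(\dfrac{2}{N}\right)^{N/2} \displaystyle\int_{0}^{\infty} \dfrac{dz}{(1 + z^{2})^{N/2}} = \dfrac{K_{2}}{2}\left(\dfrac{2}{N}\right)^{N/2} \dfrac{\sqrt{\pi}}{2}\,\Gamma\!\left(\tfrac{N-1}{2}\right)$

and

(12)   $K_{2} = \dfrac{N^{N/2}}{2^{(N-2)/2}\,\sqrt{\pi}\;\Gamma\!\left(\frac{N-1}{2}\right)}$.

It is pertinent to prove now that $A_{1}$ equals $A_{2}$.

Letting

       $w = \dfrac{N v^{2}}{2} = \dfrac{N y^{2}}{2 z^{2}}$,

then

       $v^{N-2}\, dv = \dfrac{1}{2}\left(\dfrac{2}{N}\right)^{(N-1)/2} w^{(N-3)/2}\, dw, \qquad y^{N-1}\, z^{-N}\, dz = -\dfrac{1}{2}\left(\dfrac{2}{N}\right)^{(N-1)/2} w^{(N-3)/2}\, dw$.

Substituting in (10), we have,

       $A_{1} = \dfrac{K_{1}}{2}\left(\dfrac{2}{N}\right)^{(N-1)/2} \displaystyle\int_{B}^{\infty} e^{-N y^{2}/2} \int_{0}^{N y^{2}/2b^{2}} w^{(N-3)/2}\, e^{-w}\, dw\, dy$.

Substituting in (11), we have,

       $A_{2} = \dfrac{K_{2}}{2}\left(\dfrac{2}{N}\right)^{(N-1)/2} \displaystyle\int_{B}^{\infty} e^{-N y^{2}/2} \int_{0}^{N y^{2}/2b^{2}} w^{(N-3)/2}\, e^{-w}\, dw\, dy$.

Thus

(13)   $A_{1} = A_{2} = A = \dfrac{\sqrt{N}}{\sqrt{2\pi}\;\Gamma\!\left(\frac{N-1}{2}\right)} \displaystyle\int_{B}^{\infty} e^{-N y^{2}/2} \int_{0}^{N y^{2}/2b^{2}} w^{(N-3)/2}\, e^{-w}\, dw\, dy$.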





Noting that B equals $1.96/\sqrt{N}$, it would seem logical to conclude
from the general form of equation (13) that A approaches a
limit of .025 as N increases. We have not yet succeeded in proving
this explicitly.

Numerical evaluation of the double integral for A presents
difficulties. These may be overcome by applying a succession of
reduction formulas to the series of single integrals in powers of
y obtained from the integration with respect to w. For example,
when N = 5, B = 0.8765, b = 1.3882, and



       $A = \sqrt{\dfrac{5}{2\pi}} \displaystyle\int_{B}^{\infty} e^{-5 y^{2}/2} \int_{0}^{5 y^{2}/2b^{2}} w\, e^{-w}\, dw\, dy = .025 - .0146 = .0104$.

Values for the frequency volumes A (corresponding to the
area bCD in Figure 2) are given as column (4) of Table I for the
chosen values of N. The differences between these values and .025
provide the magnitudes of the frequency volumes corresponding to
BCb and dCD. The latter volumes, which are necessarily equal,
are given in column (5) of the same table. In columns (6) and
(7) the values in columns (4) and (5) respectively are expressed
as percentages of the limiting value, .025.
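The volume bCD can also be approximated by a single numerical integration, which gives an independent check on columns (4) to (7). The sketch below is not the authors' reduction-formula method. It assumes that y is “normal” with variance 1/N, that N v² follows χ² with N − 1 degrees of freedom independently of y, and that N is odd; it then integrates the y density multiplied by the chance that v falls below y/b. B and b are supplied as inputs, and the function names are hypothetical.

```python
from math import exp, pi, sqrt


def chi2_cdf_even(x, df):
    """P(chi-square <= x) for an even number of degrees of freedom (closed form)."""
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= (x / 2) / j
        total += term
    return 1.0 - exp(-x / 2) * total


def joint_tail(n, big_b, small_b, steps=4000):
    """Frequency with y > B and z = y/v > b simultaneously (one quadrant), n odd."""
    def integrand(y):
        normal_density = sqrt(n / (2 * pi)) * exp(-n * y * y / 2)
        # z > b with y > 0 means v < y/b, i.e. N v^2 < N (y/b)^2
        return normal_density * chi2_cdf_even(n * (y / small_b) ** 2, n - 1)

    upper = big_b + 10 / sqrt(n)      # ten standard errors beyond B; the rest is negligible
    h = (upper - big_b) / steps
    total = integrand(big_b) + integrand(upper)
    for i in range(1, steps):         # composite Simpson's rule
        total += (4 if i % 2 else 2) * integrand(big_b + i * h)
    return total * h / 3


for n, big_b, small_b in ((3, 1.1316, 3.042), (5, 0.8765, 1.388), (7, 0.7408, 0.999)):
    a = joint_tail(n, big_b, small_b)
    print(n, round(a, 4), round(0.025 - a, 4),
          round(100 * a / 0.025, 1), round(100 * (0.025 - a) / 0.025, 1))
```

The printed columns correspond to N, bCD, dCD and the two percentages. They should fall close to the tabled values, though small differences in the last figure are to be expected from rounding of B and b.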





We have not succeeded as yet in expressing any of these proportional
frequencies as simple equations in terms of N only. In
Figure 3, however, a graph of the relationship is plotted, based
on the data of Table I. The vertical scale on the left gives the
proportional frequency beyond the two planes passing through C.
By following the dotted lines to the scale on the right vertical margin,
the percentage error (100 dCD/.025) with which we are concerned
may be read off directly.


Table I

Data for evaluation of volumes truncated by the planes passing
through C (Fig. 2), for different sizes of sample, where C
corresponds to the .025 points of y and z

  (1)     (2)       (3)       (4)        (5)          (6)        (7)
                                  Volumes*          Volumes* as % of .025
   N       B         b        bCD     BCb = dCD       bCD     BCb = dCD

   3    1.1316     3.042     .0064      .0186         25.6       74.4
   5     .8765     1.388     .0104      .0146         41.6       58.4
   7     .7408      .999     .0126      .0124         50.4       49.6
   9     .6533      .815     .0142      .0108         56.8       43.2
  11     .5910      .705     .0151      .0099         60.4       39.6
  13     .5436      .629     .0160      .0090         64.0       36.0
  15     .5061      .573     .0166      .0084         66.4       33.6
  17     .4754      .530     .0171      .0079         68.4       31.6
  19     .4497      .495     .0176      .0074         70.4       29.6
  21     .4277      .466     .0179      .0071         71.6       28.4
  25     .3920      .421     .0185      .0065         74.0       26.0
  29     .3640      .387     .0190      .0060         76.0       24.0
  99     .1970      .201     .0216      .0034         86.4       13.6

*Volumes (of frequency) follow the notation of Figure 2.


PRACTICAL TESTS

In order to test the accuracy of the above deductions when
applied to a supply which is grouped into fairly fine categories,
two sampling studies were made. Samples of 5 individuals each
were drawn in both cases. The first study dealt with a much used
supply of two anthropometric measures which conform fairly well
to the “normal” curve in their distributions. The second study
used as a supply a theoretical “normal” bivariate frequency surface,
seriated into classes. These studies will be referred to as
Series I and II.

Series I. From the table provided by MacDonell (1902) on
the associated variation of stature (to the nearest inch) and length
of the left middle finger (to the nearest millimeter) in 3000 British
criminals, the measurements were transferred to 3000 numbered
Denison metal-rim tags from which the cords had been removed.
After thorough checking and mixing of these circular disks, samples
of 5 tags each were drawn at random until the supply was
exhausted. Unfortunately, three of these samples were erroneously
returned to a receiving box before being copied, and the records
of 597 samples only are available. For these, the statistics y and z
were calculated for each variable, and frequency surfaces for joint
occurrence of y and z were prepared in which the statistics for
stature and finger length were first considered separately, then
combined. After calculating the correlation coefficient, the frequencies
of the opposite quadrants were added so as to provide
the seriation without regard to the signs of y and z. The actual
number of cases falling beyond the planes of truncation corresponding
to the 2.5% points were then counted and the proportional
frequencies tabled.

Series II. From the tables of the probability integral of the
“normal” correlation surface prepared by Lee and others (see
Pearson, 1931b) a correlation table of total frequency of 1000
approximately was prepared for the case where the correlation is
.5, using .3σ as the unit of classification in both directions.
Modification of the fractional frequencies to the nearest whole number
yielded a table in which N equalled 998, r equalled .5003 and the
two standard deviations equalled .9914 (Sheppard's correction
applied). Samples of 5 were drawn by working systematically
through the tables of random numbers provided by Tippett (1927),
2048 samples being so secured. These


the case of Series I.



    Variable      Series I      Series II

    1              .8744         .8797
    2              .7883         .8869
    1 + 2          .8270         .8832


For Series II the agreement with theory is splendid. The wider
deviations from the theoretical value in Series I are probably due,
in part, to the less perfectly “normal” nature of the supply
distributions.
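The theoretical correlation against which these coefficients are compared can be recovered from the independence of y and v, as a check only. Since z = y/v and y has zero mean, the correlation of y and z reduces to E(1/v) divided by the square root of E(1/v²). With N v² distributed as χ² on N − 1 degrees of freedom, this gives the closed form used in the sketch below. The derivation is supplied here as an assumption-labelled illustration and is not taken from Pearson's paper; the function name is hypothetical.

```python
from math import gamma, sqrt


def corr_y_z(n):
    """Correlation of y and z = y/v for samples of n from a normal supply.

    Assumes y and v independent, with N * v^2 ~ chi-square on N - 1 degrees
    of freedom. The variance of z is finite only when N exceeds 3.
    """
    if n <= 3:
        raise ValueError("the variance of z is infinite unless N exceeds 3")
    return sqrt((n - 3) / 2) * gamma((n - 2) / 2) / gamma((n - 1) / 2)


for n in (5, 7, 9, 15):
    print(n, round(corr_y_z(n), 4))
```

For N = 5 the expression reduces to √π/2, or about .886, which is close to the three Series II coefficients tabled above.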

The inadequacy of the correlation coefficient as a descriptive
measure of such a “non-normal” surface as that for y and z will
be apparent at once from an inspection of Figure 5. Discordance
of the two variables increases rapidly as their values increase to
such an extent that, for N equal to 5, values of z beyond the customary
level of significance provide exceedingly poor bases of
prognostication concerning the true significance of the deviation
in the mean, despite the fairly high value of the correlation
coefficient.

In Table II the frequencies beyond the chosen levels of significance
for y and z, separately and jointly, are given for both series.
The empirical frequencies are given in Roman type in the whole
numbers, and as proportions in parentheses. The theoretical values
are given in italics in the last column for comparison. The agreement
is very good in every case, the deviation of observed values
from the theoretical being well within the range of error assignable
to random sampling effects.

Table II

Comparison of actual and theoretical frequencies beyond the given
levels of significance in the practical tests

    Series                            I               II           Theoretical

    Total frequency               1194 (1)        4096 (1)             1

    Frequency beyond
    5% level for
      (a) y alone                 56 (.0469)      206 (.0504)         .05
      (b) z alone                 59 (.0494)      191 (.0467)         .05
      (c) y and z together                         79 (.0193)         .0208

    Maximum inefficiency
    of test                       62.7%           58.6%               58.4%
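A sampling test of the same kind can be imitated with pseudo-random numbers. The sketch below is an illustration only and does not reproduce either series. It assumes an ungrouped “normal” supply with m = 0 and σ = 1, draws samples of 5, and counts the samples beyond the 5% level for y alone, for z alone, and for both. The last quantity printed is the proportion of z segregates not also segregated by y, which appears to be how the final row of Table II is formed. The function name, the seed and the number of samples are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist


def simulate(samples=100_000, n=5, b=1.3882, seed=1):
    """Proportions of samples beyond the 5% level for y, for z, and for both."""
    rng = random.Random(seed)
    big_b = NormalDist().inv_cdf(0.975) / sqrt(n)    # 2.5% point of y
    y_hits = z_hits = both = 0
    for _ in range(samples):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = sum(xs) / n
        s = sqrt(sum((x - mean) ** 2 for x in xs) / n)   # Student's original s
        y = abs(mean)        # supply has m = 0 and sigma = 1; sign is disregarded
        z = y / s
        y_sig = y > big_b
        z_sig = z > b
        y_hits += y_sig
        z_hits += z_sig
        both += y_sig and z_sig
    return y_hits / samples, z_hits / samples, both / samples


p_y, p_z, p_both = simulate()
print(p_y, p_z, p_both, 1 - p_both / p_z)
```

With a run of this length the first two proportions should fall near .05 and the third near .02, subject to the usual errors of random sampling.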


SUMMARY

“Student's” distribution has been very widely used in the analysis
of small samples in order to determine the probability that
the deviation of a mean is ascribable to errors of random sampling.
Most workers appear to have lost sight of the fact that the distribution
is that of a ratio, in which both the numerator and
denominator must be expected to vary independently. It is quite
erroneous to ascribe the probability of such a ratio to the value
taken by the numerator alone.

The rationality of segregation according to any given “level of
significance” using “Student's” distribution may be analyzed by
considering the joint distributions due to errors of sampling in
the means, standard deviations, and the ratio of these two for
samples of any given size, N. Theoretical evaluation of the percentage
of irrationally segregated samples is given herein for the
odd values of N from 3 to 29 and for N = 99, using the 5% level
of significance. This percentage falls in a curvilinear manner as
N increases, a few values being 75% for N = 3, 58% for N = 5,
33% for N = 15, and 14% for N = 99. The so-called “large”
samples, then, are open to a considerable error of this kind. These
results have been verified by two extensive sampling tests for
the case where N = 5.

Results such as those given herein stress again the dangers
attendant upon the drawing of deductions of practical importance
from a single sample of small size. When only a single sample is
available it is certainly desirable that the statistical analysis should
depend not merely upon most likely estimates of needed parameters,
but also upon those of less probability which might readily
be true and which guard against the erroneous segregation of
possibly insignificant deviations.

LITERATURE CITED

Fisher, R. A.
    1915. Frequency distribution of the values of the correlation coefficient
    in samples from an indefinitely large population. Biometrika 10:
    507-521.

Fisher, R. A.
    1925. Application of “Student's” distribution. Metron 5: 2-32.

MacDonell, W. R.
    1902. On criminal anthropometry. Biometrika 1: 177-227.

“Student”
    1908. The probable error of a mean. Biometrika 6: 1-25.

“Student”
    1915. Tables for estimating the probability that the mean of a unique
    sample of observations lies between −∞ and any given distance
    of the mean of the population from which the sample is drawn.
    Biometrika 11: 414-417.

Pearson, Karl
    1931a. Some properties of “Student's” z: Correlation, regression and
    scedasticity of z with the mean and standard deviation of the sample.
    Biometrika 23: 1-9.

Pearson, Karl
    1931b. Tables for statisticians and biometricians. Part II. Cambridge
    University Press, England, pp. ccl + 262.

Tippett, L. H. C.
    1927. Random sampling numbers. Cambridge University Press, England,
    pp. viii + 26.

ACKNOWLEDGMENT 

Our thanks are most heartily extended to Professor Dunham
Jackson of the University of Minnesota for suggesting the analysis
of the y, v surface as an alternative method of elucidating the
problem, which was first explored in terms of the y, z association;
also to Professor Harold Hotelling of Columbia University for
helpful criticisms of an earlier draft of this paper. Very material
assistance has also been given by a grant-in-aid from the Rockefeller
Foundation through the Graduate School Research Fund of
the University of Minnesota.

Figure 1

Theoretical frequency surface for y and v, separately and jointly, for N = 5.

Figure 2

Theoretical frequency distributions of y and z, separately and jointly, for
N = 5. (Contours for the joint distribution are approximate only and
the intervals between them do not correspond to the same increment of
frequency.)

Figure 3

Curve to illustrate the increase in correct segregation of means by the z
test as N increases.

Figure 4

Frequency surface for the joint occurrence of y and v as secured in Series II.







